Abstract
Accurate, timely, and resource-efficient decision-making is critical for sustainable precision agriculture. This paper proposes an agentic AI-based Internet of Things (IoT) framework that enables coordinated, closed-loop perception–decision–action processes across heterogeneous sensing and actuation components. The framework models agricultural systems as distributed collections of goal-driven agents responsible for multimodal sensing, uncertainty-aware reasoning, and adaptive decision-making. To provide a structured foundation, the proposed architecture is formalized within a Multi-Agent Partially Observable Markov Decision Process (MPOMDP) perspective, enabling systematic treatment of coordination, uncertainty, and decision policies. The framework integrates multimodal information sources, including vision-based perception and environmental sensing, and defines mechanisms for their fusion and use in system-level decision-making. A proof-of-concept instantiation is presented using publicly available datasets, combining visual perception models and tabular reasoning models within the proposed agentic workflow. The experiments are designed to demonstrate the feasibility, modularity, and coordination capabilities of the framework, rather than to benchmark predictive performance or provide field-validated evaluation. The results illustrate how multimodal information can be integrated to support adaptive and resource-aware decision processes. Finally, the paper discusses key challenges and outlines directions for future work, including real-world deployment, integration with physical actuation systems, and validation under operational conditions.
1. Introduction
Global agriculture faces escalating pressure to deliver sustainable, safe, and nutritious food for a population projected to exceed 9.5 billion by 2050, under increasingly volatile climatic conditions, resource scarcity, and labor shortages [1,2,3]. Traditional farming practices, characterized by uniform input application, manual monitoring, and reactive management, are increasingly inadequate to meet these challenges at scale. Consequently, agriculture is undergoing a digital transformation commonly referred to as Agriculture 4.0 [4,5,6], driven by the convergence of the Internet of Things (IoT) [7,8], artificial intelligence (AI) [9,10], robotics [11], and cyber-physical systems [12].
Within this paradigm, AI-enabled precision agriculture (PA) has demonstrated measurable benefits in yield optimization [13], water conservation [14], and input efficiency [15]. Existing PA systems commonly integrate soil sensors, weather stations, satellite or UAV imagery, and machinery telemetry to support data-driven decision-making for irrigation, fertilization, crop selection, and pest management [16,17,18]. Machine learning and deep learning models, such as multilayer perceptrons, random forests, convolutional neural networks, and recurrent architectures, have been successfully applied for crop recommendation, yield prediction, disease detection, and irrigation scheduling [19,20,21]. Field deployments report reductions in water usage, fertilizer application, and pesticide consumption while improving productivity and sustainability outcomes [22]. Recognizing the strategic importance of AI-enabled agriculture, major research institutions and governmental bodies, including the U.S. Department of Agriculture’s National Institute of Food and Agriculture (USDA NIFA), have invested significantly, supporting multiple AI-focused research institutes since 2020 to accelerate innovation across agricultural systems [1].
However, the majority of current AI-based PA solutions remain decision-support oriented, where analytics pipelines generate recommendations that require human interpretation and execution. These architectures are typically centralized, rule-based, and episodic, operating on periodic data acquisition cycles rather than continuous feedback loops [23,24,25]. Such limitations constrain responsiveness in highly dynamic agricultural environments, where rapid shifts in weather, pest populations, and crop stress conditions demand autonomous, context-aware interventions.
Recent advances in agentic AI introduce a fundamentally different approach. Agentic AI systems are characterized by autonomous perception, goal-driven reasoning, adaptive decision-making, action execution, and continuous learning from environmental feedback [26,27]. In agricultural contexts, this paradigm enables the transition from passive analytics to closed-loop perception–decision–action systems, where AI agents not only analyze data but also coordinate sensing, planning, and actuation in real time.
Parallel to these developments, unmanned aerial vehicles (UAVs) have emerged as critical components of precision agriculture, particularly for crop monitoring and pesticide management [28,29,30]. UAVs equipped with multispectral and thermal sensors enable high-resolution assessment of crop health, stress, and pest infestation [31]. At the same time, precision spraying systems allow for localized and variable-rate pesticide application. Studies consistently demonstrate that UAV-based pesticide management can significantly reduce chemical usage, environmental contamination, and operational costs compared to conventional blanket spraying methods [32]. Nonetheless, most UAV deployments remain manually controlled or rely on pre-planned missions, limiting adaptability and real-time responsiveness.
Despite growing interest in autonomous agricultural systems, the systematic integration of agentic AI with IoT sensing infrastructures, UAV-based actuation, and sustainability constraints remains underexplored. Existing studies often focus on isolated components, such as sensor networks, predictive models, or drone operations, without a unifying orchestration framework capable of dynamic, distributed, and goal-driven coordination across the agricultural lifecycle [33,34]. While recent work has begun to address individual advances in agricultural intelligence and decision support, these efforts typically emphasize data analysis methods or specific actuation mechanisms rather than holistic agentic autonomy [35].
For example, recent surveys highlight the ongoing shift in agricultural AI toward multimodal information fusion [36], where emerging multimodal language models aim to integrate heterogeneous data such as imagery, sensor readings, and textual knowledge for enhanced farm-level decision-making [37,38,39]. However, these methods remain largely focused on data representation and inference efficiency rather than closed-loop autonomous control architectures that blend continuous perception with actuation strategies in real environments [26]. Similarly, research on agentic AI in precision agriculture has begun to articulate high-level architectures that connect cognitive reasoning layers with physical actuation devices such as irrigation systems, autonomous sprayers, and drones, showing how adaptive reasoning agents can set objectives and translate them into contextual actions on the farm. These works confirm the potential of agentic designs to improve responsiveness, resource efficiency, and sustainability, but they stop short of formalizing an explicit, multi-layer orchestration framework capable of coordinating heterogeneous agents across sensing, reasoning, decision, and actuation domains.
AgroAskAI introduces a multi-agent reasoning framework for climate adaptation that coordinates specialized agents to provide context-aware and grounded advisory outputs, although it focuses primarily on interactive decision support rather than full operational integration [40]. Similarly, agentic AI approaches for autonomous soil and fertilization management combine IoT sensing, predictive modeling, and multi-agent reinforcement learning to enable bounded autonomy, demonstrating improved resource efficiency while embedding agronomic safety constraints [41]. Complementary work on smart agriculture decision-support systems integrates multiple AI models for soil analysis, crop recommendation, and fertilizer prediction, enhanced with explainable AI techniques, but remains largely a modular pipeline without explicit multi-agent coordination [42].
Similarly, conceptual agent-based architectures have been proposed to bridge traditional farming and AI intelligence by coordinating multiple specialized agents for soil, climate, and vision sensing and even integrating natural language interaction via chatbot interfaces; however, such work remains largely at the prototype level and does not formalize distributed control, time-critical coordination, or the explicit modular layers necessary for real-world precision agriculture deployments [43]. In controlled environments, hybrid IoT-agent architectures such as AgroNova [44] and agent-based service frameworks with RAG-grounded LLM agents [45] demonstrate the effectiveness of combining edge-level autonomy, cloud-based reasoning, and policy-constrained actuation, achieving robust and low-latency operation in greenhouse settings.
Recent work on edge-enabled smart agriculture frameworks integrates IoT infrastructures with lightweight deep learning models and agentic AI to support context-aware decision-making directly at the edge [46]. Such approaches emphasize reduced latency, bandwidth efficiency, and on-device intelligence, demonstrating the feasibility of distributed inference in resource-constrained environments. However, these systems primarily optimize edge computation and responsiveness while providing limited formalization of multi-agent orchestration across sensing, reasoning, coordinated actuation, and sustainability-aware control at the system level.
Finally, research on multi-agent coordination for drone swarms highlights the importance of decentralized control and adaptive planning for agricultural robotics, though it focuses on low-level coordination rather than high-level decision-making. Overall, while these works demonstrate the potential of agentic AI across specific applications, they remain fragmented, motivating the need for a unified conceptual framework that integrates perception, reasoning, coordination, and action within IoT-enabled precision agriculture systems [47].
In addition to these architectural aspects, several critical system-level dimensions remain insufficiently addressed in existing work. First, while many frameworks focus on improving prediction accuracy or system responsiveness, they rarely consider the operational energy footprint of AI-driven agricultural systems, particularly in edge-enabled deployments where computational efficiency directly impacts sustainability. Second, most approaches lack an explicit mathematical formalization of agent coordination and decision-making, limiting the ability to systematically reason about uncertainty, system behavior, and optimization objectives. Third, although autonomous actuation is increasingly explored, relatively little attention is given to safety and governance mechanisms, such as constraint-aware decision policies, regulatory compliance, and human-on-the-loop control. These dimensions are essential for transitioning from conceptual or decision-support systems toward reliable, scalable, and deployable autonomous agricultural solutions. To provide a structured comparison of these approaches, Table 1 summarizes key architectural and functional characteristics of recent agentic AI-based agricultural frameworks.
Table 1.
Comparative analysis of recent agentic AI-based agricultural frameworks and the proposed approach. Check marks (✓) denote features that are explicitly incorporated in a given framework, while cross marks (×) denote features that are absent, unsupported, or not clearly specified in the referenced work.
The columns in Table 1 capture key system capabilities. Multimodal denotes the use of heterogeneous data sources, while Fusion refers to their integration into a unified representation. Agents indicates distributed, interacting components. Closed-Loop denotes systems that connect perception, decision-making, and actuation. Energy-Aware reflects explicit consideration of computational and operational efficiency. Feedback captures adaptation based on outcomes. Formal Model indicates the presence of an explicit mathematical framework, and Safety refers to constraint-aware and human-supervised decision mechanisms.
Table 1 highlights that existing approaches primarily focus on sensing, analysis, or decision support. By contrast, the framework proposed in this paper unifies these strands by explicitly aligning the IoT monitoring layer, multimodal perception and reasoning agents, and coordinated action planning within a single, distributed agentic architecture. This integration supports real-time evidence fusion, adaptive model selection, and energy-aware coordination, while also incorporating safety mechanisms and a rigorous mathematical formalization of agent coordination and decision-making. As a result, the proposed framework addresses a critical gap in the literature, where predictive intelligence models are rarely combined with decentralized decision cycles, autonomous actuation, and formally grounded control in sustainable field-level interventions.
The primary objective of this paper is to present a comprehensive agentic AI-based IoT precision agriculture framework that articulates both architectural design and operational principles. The paper further examines the technical, economic, and deployment challenges associated with AI-driven autonomy in agriculture, including data quality, connectivity, system reliability, hardware integration, and scalability. Particular attention is given to sustainability considerations, aligning the proposed model with net-positive AI energy principles, whereby the environmental and resource savings enabled by AI outweigh the energy costs of model training and deployment.
The contributions of this paper are as follows:
1. We conceptualize a fully agentic AI architecture for precision agriculture, clearly distinguishing it from conventional AI-driven and decision-support PA systems.
2. We provide a mathematical formalization of the proposed framework within a Multi-Agent Partially Observable Markov Decision Process (MPOMDP), enabling a structured representation of agent coordination, belief-based reasoning, uncertainty handling, and decision-making under real-world constraints.
3. We formalize a reference multi-agent orchestration workflow for crop protection, specifying functional roles, coordination mechanisms, safety constraints, and closed-loop perception–decision–action logic independent of any specific implementation.
4. We provide a proof-of-concept multimodal instantiation of the proposed framework using publicly available datasets, integrating vision-based perception agents and environmental reasoning agents to demonstrate autonomous evidence fusion and adaptive decision-making.
5. We identify and analyze technical, operational, and sustainability challenges specific to deploying agentic AI in real agricultural environments.
6. We align the proposed agentic AI-based PA model with net-positive AI energy principles, highlighting resource efficiency, reduced chemical inputs, and sustainable operational pathways.
The remainder of the paper is organized as follows. Section 2 presents the proposed agentic AI-based IoT Precision Agriculture framework, including the conceptual architecture, its mathematical formalization, and the reference multi-agent workflow for crop protection. Section 3 provides a proof-of-concept multimodal instantiation using publicly available visual and environmental datasets to demonstrate agent coordination and closed-loop reasoning; it also discusses the experimental observations, practical deployment considerations, and current limitations. Finally, Section 4 summarizes the contributions and outlines directions for future research.
2. Agentic AI-Based IoT Precision Agriculture Framework
In this section, we present the proposed agentic AI-based IoT Precision Agriculture (PA) framework, illustrated in Figure 1, which addresses the challenges of conventional AI-based IoT PA systems by drawing on the emerging capabilities of agentic AI. We introduce an agent orchestration framework for pesticide management, detailing the architectural components, operational principles, and required technical infrastructure, including unmanned aerial vehicles (UAVs).
Figure 1.
Overview of the proposed agentic AI-based IoT precision agriculture framework. The system follows a layered workflow comprising: (i) multimodal data acquisition from diverse sources, including UAV imagery, sensor measurements, satellite data, and weather information; (ii) perception and processing through data cleaning, transformation, and multimodal fusion; (iii) coordinated decision-making via interacting agentic components that operate on context-aware representations; (iv) action execution through automated agricultural interventions; and (v) feedback and learning mechanisms that enable performance monitoring and adaptive policy updates. The architecture is complemented by an ethical and governance foundation addressing privacy, fairness, transparency, and security.
2.1. Agentic AI Precision Agriculture: Conceptual Clarification
While agentic AI has gained increasing attention in recent literature, it is often conflated with automated machine learning pipelines, rule-based decision engines, or classical multi-agent systems. To avoid ambiguity and to clearly position our contribution, we formally define agentic AI in the context of precision agriculture.
Definition 1 (Agentic AI in Precision Agriculture).
An agentic AI system is a collection of autonomous, goal-driven agents capable of perceiving environmental states, deliberating under partial observability, making decisions, executing physical or digital actions, learning from feedback, and coordinating with other agents continuously, with minimal or optional human intervention.
In this work, the term agent refers to a functional component within a distributed decision-making architecture, responsible for specific perception, reasoning, or decision-making tasks based on multimodal data. In contrast to traditional AI-driven PA solutions, agentic AI systems do not merely analyze data or generate recommendations. Instead, they close the perception–decision–action loop, enabling autonomous execution through connected robotic and IoT devices.
To further clarify this distinction, Table 2 contrasts conventional AI-based PA systems with agentic AI-based PA systems. This distinction is fundamental to the proposed model, as the shift from centralized, recommendation-driven systems to distributed, autonomous agents enables real-time responsiveness, scalability, and long-term adaptability in dynamic agricultural environments.
Table 2.
Comparison between classical AI-based PA and agentic AI-based PA.
2.2. Formal Modeling of Agent Coordination and Decision-Making
To establish a rigorous theoretical foundation for our agentic AI-driven IoT architecture, we mathematically model the multi-agent system as a Multi-Agent Partially Observable Markov Decision Process (MPOMDP) [48,49]. This formulation enables a structured description of agent perception, uncertainty handling, coordination, and decision-making under real-world constraints. Formally, the agentic precision agriculture system is defined as:
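$$\mathcal{M} = \langle \mathcal{N}, \mathcal{S}, \{\mathcal{O}_i\}_{i \in \mathcal{N}}, \{\mathcal{A}_i\}_{i \in \mathcal{N}}, \mathcal{T}, \mathcal{R} \rangle$$
where:
- $\mathcal{N}$ denotes the set of agents, including perception, reasoning, decision-making, and planning agents;
- $\mathcal{S}$ represents the latent global state of the agricultural environment, including crop health, pest presence, and environmental conditions;
- $\mathcal{O}_i$ denotes the observation space of agent i, derived from multimodal sources such as UAV imagery and IoT sensor streams;
- $\mathcal{A}_i$ denotes the action space of agent i (e.g., UAV flight trajectories, localized activations of spraying nozzles);
- $\mathcal{T}(s_{t+1} \mid s_t, \mathbf{a}_t)$ defines the stochastic transition dynamics of the environment, where $\mathbf{a}_t = (a_t^{i})_{i \in \mathcal{N}}$ is the joint action at decision step t;
- $\mathcal{R}(s_t, \mathbf{a}_t)$ is a shared reward function capturing system-level objectives (e.g., net-positive AI energy principle).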
The system operates in discrete decision steps indexed by t, where agents observe, reason, and act sequentially.
Due to the inherent uncertainty and incomplete sensing in agricultural environments, agents do not have direct access to the true state $s_t \in \mathcal{S}$. Instead, the system maintains a belief over the state:
$$b_t(s) = P(s_t = s \mid o_{1:t}, \mathbf{a}_{1:t-1})$$
where $o_{1:t}$ and $\mathbf{a}_{1:t-1}$ denote the observation and joint-action histories up to step t.
To further ground the treatment of uncertainty, observations can be interpreted as stochastic outputs of an underlying observation process. In this context, an observation function can be defined as:
$$\Omega(o_t \mid s_t, \mathbf{a}_{t-1}) = P(o_t \mid s_t, \mathbf{a}_{t-1})$$
which represents the probability of observing $o_t$ given the latent state $s_t$ and the previous action $\mathbf{a}_{t-1}$. Within the proposed framework, this formulation provides an abstract interpretation of AI model outputs (e.g., detection scores or probabilistic predictions) as uncertainty-aware observations, without requiring explicit modeling of the observation process.
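To make the belief-maintenance step concrete, the following is a minimal sketch of one discrete Bayes-filter update over a finite state set. It is an illustration rather than the procedure used in this work: the two-state model, the function name `update_belief`, and all probabilities are hypothetical, and the framework itself does not require an explicit observation model.

```python
def update_belief(belief, transition, likelihood):
    """One Bayes-filter step over a finite state set.

    belief:     dict state -> probability (current belief)
    transition: dict state -> dict next_state -> probability, already
                conditioned on the joint action that was just executed
    likelihood: dict next_state -> probability of the new observation
                given that state
    """
    # Prediction: push the current belief through the transition model.
    predicted = {s2: 0.0 for s2 in likelihood}
    for s, p in belief.items():
        for s2, t in transition[s].items():
            predicted[s2] += p * t
    # Correction: weight each state by how well it explains the observation.
    unnormalized = {s2: predicted[s2] * likelihood[s2] for s2 in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: v / total for s2, v in unnormalized.items()}


# Hypothetical two-state example; every number is illustrative only.
belief = {"healthy": 0.8, "diseased": 0.2}
transition = {
    "healthy": {"healthy": 0.9, "diseased": 0.1},
    "diseased": {"healthy": 0.0, "diseased": 1.0},
}
# A detector score interpreted as P(observation | state).
likelihood = {"healthy": 0.2, "diseased": 0.7}
posterior = update_belief(belief, transition, likelihood)
```

In this toy setting, an observation that is more likely under the diseased state shifts probability mass toward that state, which is the behavior the belief abstraction is meant to capture.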
In the proposed framework, multimodal belief estimation is performed through the integration of:
- visual perception agents, producing spatially localized disease likelihoods;
- environmental reasoning agents, providing probabilistic assessments based on climatic conditions.
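These outputs, denoted here by $p_t^{\mathrm{vis}}$ and $p_t^{\mathrm{env}}$, are fused by a fusion operator $\Phi$ into a shared belief representation:
$$b_t = \Phi\big(p_t^{\mathrm{vis}}, p_t^{\mathrm{env}}\big)$$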
which serves as the basis for downstream decision-making.
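One simple way to realize such a fusion step is log-odds (naive Bayes) combination of the two probabilistic outputs. The sketch below assumes that the visual and environmental evidence are conditionally independent given the disease state and that both agents share the same prior; these are assumptions made for illustration, not properties established by the framework, and the function names are hypothetical.

```python
import math


def _clip(p, eps=1e-6):
    """Keep probabilities away from 0 and 1 so the logit stays finite."""
    return min(max(p, eps), 1.0 - eps)


def _logit(p):
    return math.log(p / (1.0 - p))


def fuse_probabilities(p_visual, p_env, prior=0.5):
    """Fuse two posterior estimates of the same binary event.

    Assumes conditional independence of the two evidence sources given
    the event, and a common prior. Each source contributes its log-odds
    shift relative to that prior.
    """
    p_visual, p_env, prior = _clip(p_visual), _clip(p_env), _clip(prior)
    log_odds = (
        _logit(prior)
        + (_logit(p_visual) - _logit(prior))
        + (_logit(p_env) - _logit(prior))
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Illustrative values: a fairly confident detector and mildly
# favourable climatic conditions for the disease.
fused = fuse_probabilities(p_visual=0.8, p_env=0.6)
```

With a neutral prior, two sources that both lean toward disease reinforce each other, so the fused estimate exceeds either input. Other fusion operators, such as weighted averaging or learned fusion, would fit the same interface.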
Each agent follows a policy:
$$\pi_i : \mathcal{O}_i \to \mathcal{A}_i$$
mapping local observations to actions. While sensing and local inference are decentralized (each perception agent processes local observations), the proposed system adopts a hybrid coordination strategy, where a centralized or hierarchical decision-making process operates over the fused belief state:
$$\mathbf{a}_t = \pi(b_t)$$
This formulation is consistent with the safety-constrained execution mechanism defined in the subsequent section, where the selected action is validated prior to execution.
Agent behavior is guided by a global objective that balances agricultural effectiveness with resource efficiency and sustainability; for example, it can be defined as:
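$$J = \mathbb{E}\left[\sum_{t} \Big( U(s_t, \mathbf{a}_t) - \lambda_1 C(\mathbf{a}_t) - \lambda_2 E(\mathbf{a}_t) \Big)\right]$$
where
- $U(s_t, \mathbf{a}_t)$ measures crop protection effectiveness.
- $C(\mathbf{a}_t)$ represents chemical usage associated with action $\mathbf{a}_t$.
- $E(\mathbf{a}_t)$ captures computational and operational energy costs incurred by, for example, UAV flights and edge-AI inference.
- $\lambda_1, \lambda_2 \geq 0$ are trade-off coefficients.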
This formulation aligns with the concept of net-positive AI operation introduced in this work, where resource savings outweigh the energy costs of AI-driven decision-making. The presented formulation does not assume explicit multi-agent reinforcement learning or optimal policy computation. Instead, it provides a conceptual and mathematical grounding for the proposed architecture, enabling systematic reasoning about uncertainty, coordination, and decision-making. Future work will explore learning-based policy optimization and decentralized coordination strategies within this formal framework.
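As a small illustration of how such an objective trades effectiveness against chemical and energy costs, the sketch below scores a handful of candidate actions for a single decision step and selects the highest-scoring one. The candidate actions, their numeric attributes, and the weight values are hypothetical and serve only to show the mechanics; no policy optimization is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateAction:
    name: str
    protection: float  # expected crop-protection benefit (unitless score)
    chemical_l: float  # pesticide volume used, in litres
    energy_wh: float   # UAV flight plus inference energy, in watt-hours


def step_reward(action, w_chem, w_energy):
    """Effectiveness minus weighted chemical and energy costs."""
    return (
        action.protection
        - w_chem * action.chemical_l
        - w_energy * action.energy_wh
    )


def select_action(candidates, w_chem, w_energy):
    """Greedy one-step choice; a stand-in for a full decision policy."""
    return max(candidates, key=lambda a: step_reward(a, w_chem, w_energy))


# Illustrative candidates; none of these figures come from measurements.
candidates = [
    CandidateAction("no_op", 0.0, 0.0, 0.0),
    CandidateAction("spot_spray", 0.7, 2.0, 40.0),
    CandidateAction("blanket_spray", 0.9, 20.0, 150.0),
]
best = select_action(candidates, w_chem=0.02, w_energy=0.002)
```

With nonzero cost weights the localized intervention wins in this example, whereas setting both weights to zero makes blanket spraying preferable, which shows how the trade-off coefficients encode the sustainability preference.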
2.3. System Interoperability and Human-on-the-Loop Governance
To transition this framework from conceptual integration to a deployable architecture, we explicitly define the communication protocols enabling heterogeneous multi-agent coordination alongside the safety mechanisms required for high-risk autonomous actuation. While specific protocols are identified to ground the design, the framework remains extensible and can be adapted to alternative standards and communication technologies without affecting the overall architecture. To coordinate heterogeneous agents and devices from different vendors, the framework adopts a tiered communication strategy intended to support interoperability without vendor-specific coupling. The proposed architecture defines standardized interfaces, messaging protocols, and data exchange formats that support reliable and low-latency interaction between sensing, reasoning, and actuation components.
The framework adopts a hybrid communication strategy aligned with real-world deployment constraints. For continuous data ingestion and coordination across heterogeneous components, monitoring agents are designed to utilize MQTT (Message Queuing Telemetry Transport) as a lightweight, asynchronous communication protocol [50]. MQTT supports the exchange of diverse data streams, including environmental sensor measurements, UAV telemetry, robotic platform status, and intermediate outputs from perception and reasoning agents [51]. Its publish-subscribe model enables scalable, decoupled, and fault-tolerant communication between IoT sensors, edge devices, and higher-level agents, even under limited or intermittent connectivity conditions typical of agricultural environments. In addition to sensor data, MQTT facilitates the dissemination of event-driven messages, such as detected anomalies, disease alerts, and task triggers, allowing perception and reasoning agents to dynamically influence downstream decision-making and planning processes.
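The publish-subscribe pattern described above can be sketched without committing to a particular client library. The example below builds a topic and JSON payload for a sensor reading and implements a simplified version of MQTT topic-filter matching, where `+` matches exactly one level and `#` matches the remainder. The topic hierarchy `farm/<farm_id>/<device_id>/<kind>` and the payload fields are assumptions made for illustration; an actual deployment would hand the resulting topic and payload to an MQTT client.

```python
import json
import time


def build_message(farm_id, device_id, kind, data, timestamp=None):
    """Return (topic, JSON body) for one reading or event.

    The topic layout and field names are hypothetical conventions.
    """
    topic = f"farm/{farm_id}/{device_id}/{kind}"
    body = json.dumps(
        {
            "ts": time.time() if timestamp is None else timestamp,
            "device": device_id,
            "kind": kind,
            "data": data,
        },
        sort_keys=True,
    )
    return topic, body


def topic_matches(topic_filter, topic):
    """Simplified MQTT filter matching ('+' one level, '#' the rest).

    Special handling of topics beginning with '$' is omitted here.
    """
    filter_parts = topic_filter.split("/")
    topic_parts = topic.split("/")
    for i, part in enumerate(filter_parts):
        if part == "#":
            # The multi-level wildcard is only valid as the last level.
            return i == len(filter_parts) - 1
        if i >= len(topic_parts):
            return False
        if part != "+" and part != topic_parts[i]:
            return False
    return len(filter_parts) == len(topic_parts)


topic, body = build_message(
    "f1", "soil-07", "telemetry", {"moisture": 0.31}, timestamp=0
)
```

A reasoning agent interested in all telemetry from one farm could subscribe with a filter such as `farm/f1/+/telemetry`, while an alerting agent could use `farm/f1/#` to receive both telemetry and event messages.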
For real-time coordination of robotic agents and actuation processes, the framework is designed to leverage ROS2 (Robot Operating System 2) [52], underpinned by the Data Distribution Service (DDS), as a representative solution for distributed, real-time communication. ROS2 enables deterministic communication, distributed execution, and native support for multi-robot systems, making it well-suited for time-critical operations such as UAV navigation and autonomous spraying. In this context, ROS2 facilitates decentralized, peer-to-peer communication between heterogeneous physical agents, including aerial UAVs and ground-based robotic platforms. For example, when perception agents detect critical disease symptoms, spatial information and task directives are propagated through ROS2/DDS, allowing robotic agents to share coordinates, coordinate trajectories, and negotiate obstacle avoidance in real time without reliance on a centralized cloud infrastructure.
At a higher level, REST/HTTP or gRPC APIs are intended to support system orchestration, monitoring, and integration with external platforms such as farm management systems. These interfaces enable interaction with user-facing applications, data repositories, and decision-support tools, ensuring seamless integration of the agentic system within existing agricultural digital ecosystems.
To ensure consistency across heterogeneous data sources, all exchanged information is designed to follow standardized data schemas and formats. These include geospatial representations such as GeoJSON and raster tiles for spatial field data, time-series formats (e.g., timestamped JSON or CSV streams) for environmental and IoT measurements, and structured semantic annotations for perception outputs, including bounding boxes, class labels, and confidence scores. This standardization is intended to enable seamless integration of multimodal data streams and to allow agents developed by different vendors to interoperate without requiring tight coupling or proprietary interfaces.
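As an example of such a schema, the sketch below wraps a single perception output as a GeoJSON Feature with a Point geometry and carries the bounding box, class label, and confidence score as properties. The property names, the label, and the coordinates are hypothetical; only the outer GeoJSON structure (`type`, `geometry`, `properties`, and the longitude-then-latitude coordinate order) follows the GeoJSON standard.

```python
import json


def detection_to_feature(lon, lat, label, confidence, bbox_px, captured_at):
    """Wrap one perception output as a GeoJSON Feature (Point geometry)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return {
        "type": "Feature",
        # GeoJSON positions are ordered [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "label": label,
            "confidence": confidence,
            "bbox_px": list(bbox_px),  # x_min, y_min, x_max, y_max in pixels
            "captured_at": captured_at,
        },
    }


def to_feature_collection(features):
    """Group features so a whole UAV pass can be exchanged as one document."""
    return {"type": "FeatureCollection", "features": list(features)}


# Illustrative detection; the values are placeholders, not real data.
feature = detection_to_feature(
    lon=23.7275,
    lat=37.9838,
    label="leaf_blight",
    confidence=0.91,
    bbox_px=(120, 64, 212, 150),
    captured_at="2025-06-01T09:30:00Z",
)
document = json.dumps(to_feature_collection([feature]))
```

Because the result is plain JSON, the same document can be published over the messaging layer, stored for auditing, or rendered directly by standard GIS tooling.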
The communication strategy is designed to support hierarchical deployment across edge and cloud environments. Edge-level communication enables low-latency data exchange and real-time decision-making close to the data sources, which is critical for time-sensitive operations such as monitoring and actuation. In parallel, cloud-level synchronization is intended to support model updates, centralized logging, historical data analysis, and long-term optimization. This hybrid edge-cloud coordination is designed to ensure robustness under intermittent connectivity conditions while maintaining overall system coherence and scalability. Beyond communication and coordination, the transition to autonomous operation also requires explicit mechanisms to ensure safe and accountable decision execution.
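Robustness under intermittent connectivity is commonly obtained with a store-and-forward buffer at the edge. The following sketch is one possible realization under stated assumptions: the class name `StoreAndForward`, the bounded queue size, and the policy of dropping the oldest message when the buffer is full are illustrative choices, and `send` stands for any uplink function that returns `True` on success.

```python
from collections import deque


class StoreAndForward:
    """Bounded edge-side buffer that retries delivery when the link returns."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # callable(message) -> bool, True on success
        # When full, appending silently discards the oldest message.
        self._pending = deque(maxlen=max_pending)

    def submit(self, message):
        """Queue a message and attempt delivery; returns messages sent."""
        self._pending.append(message)
        return self.flush()

    def flush(self):
        """Send queued messages in order until one fails."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)


# Simulated uplink: delivery succeeds only while the link is up.
delivered = []
link = {"up": False}


def uplink(message):
    if link["up"]:
        delivered.append(message)
        return True
    return False


buffer = StoreAndForward(uplink, max_pending=3)
for reading in range(5):
    buffer.submit(reading)  # link is down, so readings accumulate
```

Dropping the oldest entries favors fresh measurements, which suits monitoring data; safety-relevant events would more plausibly need a persistent queue that never discards messages.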
Given the transition from decision-support systems to autonomous intervention, the proposed framework is designed to incorporate explicit human-on-the-loop safety mechanisms to ensure reliability, accountability, and regulatory compliance in high-risk operations. High-impact actions, such as pesticide application or large-scale intervention planning, are designed to be subject to a safety validation mechanism that evaluates decisions before execution. This mechanism integrates:
- Confidence-aware decision thresholds, derived from multimodal belief estimates, ensuring that actions are triggered only when sufficient certainty is achieved.
- Rule-based safety constraints, including regulatory limits on pesticide usage, environmental conditions (e.g., wind speed), and spatial restrictions.
- Human override interfaces, enabling operators to monitor, approve, or abort actions in real time.
Formally, an action $\mathbf{a}_t$ is executed only if it satisfies:
$$\kappa(b_t) \geq \tau \quad \text{and} \quad \mathbf{a}_t \in \mathcal{C}$$
where
- t denotes the discrete time step in the sequential decision-making process.
- $\kappa(\cdot)$ denotes a confidence function that maps the fused belief state $b_t$ to a scalar confidence score associated with the decision.
- $\tau$ is a predefined safety threshold.
- $\mathcal{C}$ encodes domain and regulatory constraints as the set of admissible actions.
Additionally, the framework is designed to log all decisions and actions together with associated evidence (sensor data, model outputs, confidence scores), enabling traceability and post hoc auditing, which is essential for liability management. This human-on-the-loop design is intended to ensure that the system maintains operational autonomy while preserving human oversight in safety-critical scenarios.
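The validation and logging logic can be sketched as a small gate that checks a confidence threshold, rule-based constraints, and an operator abort flag before any action is released, and that records every decision with its evidence. All thresholds, field names, and constraint values below are hypothetical placeholders; real limits would come from product labels, local regulation, and agronomic guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprayRequest:
    zone_id: str
    confidence: float       # scalar confidence derived from the fused belief
    dose_l_per_ha: float    # requested application rate
    wind_speed_ms: float    # current wind speed at the field
    in_no_spray_zone: bool  # spatial restriction flag


def validate(request, operator_abort=False, min_confidence=0.8,
             max_dose_l_per_ha=2.0, max_wind_ms=4.0):
    """Return the list of violated conditions (empty means admissible)."""
    reasons = []
    if request.confidence < min_confidence:
        reasons.append("low_confidence")
    if request.dose_l_per_ha > max_dose_l_per_ha:
        reasons.append("dose_limit")
    if request.wind_speed_ms > max_wind_ms:
        reasons.append("wind_limit")
    if request.in_no_spray_zone:
        reasons.append("spatial_restriction")
    if operator_abort:
        reasons.append("operator_abort")
    return reasons


audit_log = []  # in practice this would be durable, append-only storage


def gate(request, **limits):
    """Validate a request, log the decision with evidence, return approval."""
    reasons = validate(request, **limits)
    audit_log.append({
        "zone": request.zone_id,
        "confidence": request.confidence,
        "wind_speed_ms": request.wind_speed_ms,
        "approved": not reasons,
        "reasons": reasons,
    })
    return not reasons


ok = gate(SprayRequest("z1", 0.92, 1.5, 2.0, False))
blocked = gate(SprayRequest("z2", 0.92, 1.5, 6.5, False))
```

The operator flag models human-on-the-loop supervision: actions proceed by default when all automated checks pass, but a human can veto them at any time, and every outcome remains traceable through the log.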
2.4. Core Functions of Agentic AI-Based IoT Precision Agriculture Framework
The proposed agentic AI-based IoT PA framework consists of multiple autonomous agents that collaboratively manage agricultural processes across sensing, intelligence, and actuation layers.
Data Collection and Analysis. Multimodal data are continuously collected from IoT soil sensors, weather stations, UAV-based imaging systems, and satellite platforms. These data streams are fused and analyzed to maintain an up-to-date situational representation of field conditions, including soil status, crop development, and pest activity.
Autonomous Decision-Making. Based on real-time data analysis, decision agents autonomously determine optimal strategies for irrigation, fertilization, pest control, and planting schedules. Decisions are guided by system objectives such as yield stability, resource efficiency, and environmental sustainability.
Autonomous Operations and Actuation. Agentic AI systems directly execute decisions through connected robotic platforms and IoT-enabled actuators. Operations such as precision spraying, planting, harvesting, and weed control are performed autonomously without continuous human oversight.
Adaptive Learning and Optimization. The system continuously learns from environmental feedback and operational outcomes. Models and policies are incrementally updated, enabling adaptation to seasonal variability, evolving pest behavior, and long-term environmental changes.
Predictive Analytics and Risk Management. Predictive models forecast crop growth, pest outbreaks, disease risk, and adverse weather conditions, enabling proactive management and dynamic risk mitigation strategies.
Collectively, these capabilities result in increased productivity and yield consistency, reduced input costs and resource wastage, minimized chemical usage, and improved precision and timeliness of agricultural decision-making.
2.5. Key Challenges of AI-Based IoT Precision Agriculture
Despite its transformative potential, the deployment of the proposed AI-based IoT precision agriculture framework faces several technical, operational, and environmental challenges, as shown in Figure 2.
Figure 2.
Overview of key challenges in AI-based IoT precision agriculture systems. The figure illustrates critical limitations encountered in real-world deployments, including: (i) weak connectivity and unreliable network coverage in remote agricultural areas; (ii) energy constraints and battery limitations of sensors, UAVs, and edge devices; (iii) integration issues due to heterogeneous, multi-vendor hardware and software ecosystems; (iv) data security and privacy concerns, including exposure to cyber threats; and (v) high costs associated with equipment, maintenance, and infrastructure. These factors pose significant barriers to scalable, robust, and secure deployment of autonomous and intelligent agricultural systems.
Data Quality and Quantity. Accurate decision-making depends on high-quality, reliable, and sufficiently diverse data. Noisy, incomplete, or inconsistent sensor readings can lead to incorrect inferences and suboptimal actions. Additionally, acquiring representative datasets across varying seasons, soil conditions, and crop types remains a significant challenge for robust model training and validation.
Connectivity and Network Constraints. Precision agriculture often operates in remote or rural regions where reliable internet connectivity is limited [53,54]. Poor or intermittent network access restricts real-time data transmission, remote system management, and coordinated agent behavior, particularly for time-sensitive operations such as pest control.
Integration of Devices and Systems. Agricultural environments typically employ heterogeneous IoT devices, sensors, UAVs, and legacy systems from multiple vendors. Integrating these components into a unified platform for sensing, control, and decision-making requires standardized interfaces, interoperability protocols, and flexible system architectures.
Data Security and Privacy. The increasing collection and transmission of sensitive agricultural data expose systems to cyber threats and unauthorized access [55,56,57]. Ensuring secure communication, data integrity, and privacy preservation is critical, especially in autonomous systems that directly control physical actuators.
High Implementation Costs. Deploying IoT sensors, UAV platforms, edge computing infrastructure, and AI analytics involves substantial initial investment. For small- and medium-scale farmers, these costs can impede adoption unless scalable, cost-effective, and modular solutions are developed.
Technical Expertise Requirements. Designing, operating, and maintaining agentic AI-based IoT systems require interdisciplinary expertise spanning agriculture, AI, robotics, and networking. The lack of trained personnel represents a significant barrier to large-scale deployment and long-term sustainability.
Scalability. Scaling precision agriculture solutions from small experimental farms to large commercial operations introduces challenges related to data volume, agent coordination, system reliability, and computational efficiency.
Device Power Management. IoT sensors and field-deployed devices often rely on limited power sources. Ensuring long-term, energy-efficient operation through low-power hardware, adaptive sensing strategies, and energy-aware agent decisions is essential.
Environmental Robustness. Sensors, UAVs, and robotic equipment must operate reliably under harsh environmental conditions, including extreme temperatures, precipitation, dust, and mechanical stress.
Data Integration and Analysis. Processing and analyzing large volumes of heterogeneous, real-time data streams from diverse sources remains computationally demanding. Effective data fusion, edge analytics, and distributed learning mechanisms are required to generate timely and actionable insights.
Many of the challenges outlined above—such as scalability, responsiveness, and resource optimization—cannot be fully addressed by centralized or human-in-the-loop systems. Agentic AI-based PA offers a principled solution by enabling distributed intelligence, autonomous action, and continuous learning. Through coordinated agent behavior, the proposed model enhances robustness, adaptability, and sustainability, laying the foundation for next-generation precision agriculture systems.
2.6. Reference Agent Workflow for Pesticide Management
Pesticide management represents a critical and high-impact application domain for agentic AI-based IoT precision agriculture, as it requires spatially and temporally precise interventions to balance effective pest control with environmental sustainability and economic efficiency. Conventional blanket spraying approaches often lead to chemical overuse, environmental contamination, and increased operational costs, while existing precision agriculture solutions remain largely advisory and dependent on human execution. The proposed agentic AI framework enables a transition to autonomous, closed-loop pesticide management through coordinated, goal-driven agents.
The pesticide management reference workflow operates as a continuous perception–decision–action loop, where autonomous agents collaboratively monitor crop and pest conditions, reason over multimodal data, and execute targeted interventions in real time. The system leverages IoT sensors, UAV-based remote sensing, and predictive analytics to ensure pesticides are applied only when necessary, at the correct dosage, location, and time.
The system objectives for the pesticide management use case are as follows:
- Optimize pesticide application for maximum pest control efficiency;
- Minimize chemical usage, environmental contamination, and operational waste;
- Enable real-time, adaptive decision-making under dynamic field conditions;
- Support sustainable and regulation-compliant agricultural practices.
Based on these objectives, the proposed agentic AI-based IoT precision agriculture framework employs the following agents:
Monitoring Agents. Monitoring agents continuously collect field data using IoT soil sensors, weather stations, UAV-mounted multispectral cameras, and satellite imagery. These agents monitor crop health indicators, pest presence, micro-climatic conditions, and soil status, producing structured, time-stamped datasets that reflect current field conditions.
Data Processing and Analysis Agents. Analysis agents process incoming data streams using computer vision, machine learning classifiers, and anomaly detection techniques to identify pest infestations, crop stress patterns, and environmental risk factors. The outputs include pest hotspot maps, infestation severity levels, and spatial crop health assessments.
Decision-Making Agents. Decision agents autonomously determine pesticide application strategies based on analyzed data. These agents select appropriate pesticide types, calculate optimal dosages, and determine application timing while considering pest severity, crop growth stage, weather conditions, and regulatory constraints. Decisions are goal-driven, prioritizing effectiveness, safety, and sustainability.
Planning and Scheduling Agents. Planning agents translate decisions into executable operational plans. They coordinate UAVs or robotic sprayers by optimizing flight paths, scheduling application windows, allocating resources, and adapting plans in response to changing weather or operational constraints.
Feedback and Learning Agents. Feedback agents monitor post-application outcomes through continued sensing and observation. The effectiveness of interventions is evaluated, and learning agents update models and policies accordingly, enabling continuous improvement and long-term adaptation to evolving pest behavior and environmental conditions.
Through coordinated agent behavior, the proposed use-case enables variable-rate, site-specific pesticide application, significantly reducing chemical usage and environmental impact while maintaining or improving crop protection effectiveness. The autonomous, closed-loop design enhances responsiveness to rapidly changing field conditions, reduces dependence on human intervention, and supports scalable, sustainable precision agriculture operations.
The above workflow specifies a reference use-case for agentic crop protection, emphasizing functional requirements (evidence fusion, safety/compliance gates, planning, traceability, and feedback) independent of any specific model or dataset. In Section 3, we instantiate a simplified subset of this loop using publicly available visual and environmental data as sensing proxies in order to demonstrate feasibility and modular agent interoperability; physical actuation (e.g., UAV spraying) is treated conceptually as the downstream action interface rather than a field-deployed component.
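As an illustration only, the decision and planning steps of the reference workflow can be sketched in a few lines of Python. The class names, the severity scale, and both thresholds are hypothetical and are not part of the proposed framework; the sketch merely shows how a compliance gate (here, a wind-speed limit) and variable-rate dosing might sit between analysis outputs and an actuation plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldObservation:
    """Assumed output of the monitoring and analysis agents for one zone."""
    zone: str
    pest_severity: float   # 0.0 (none) to 1.0 (severe); assumed scale
    wind_speed_ms: float   # current wind speed in m/s


@dataclass
class SprayPlan:
    zone: str
    dose_fraction: float   # fraction of the full dose to apply


# Illustrative thresholds; real values depend on crop, product, and regulation.
SEVERITY_THRESHOLD = 0.3
MAX_WIND_MS = 5.0


def decide(obs: FieldObservation) -> Optional[SprayPlan]:
    """Decision agent: spray only when needed and when conditions allow."""
    if obs.pest_severity < SEVERITY_THRESHOLD:
        return None        # no intervention needed
    if obs.wind_speed_ms > MAX_WIND_MS:
        return None        # compliance gate: too windy, defer
    # Variable-rate dosing: scale the dose with severity, capped at 1.0.
    return SprayPlan(obs.zone, min(1.0, round(obs.pest_severity, 2)))


def run_cycle(observations: List[FieldObservation]) -> List[SprayPlan]:
    """Planning agent: collect per-zone plans, highest dose first."""
    plans = [p for p in map(decide, observations) if p is not None]
    return sorted(plans, key=lambda p: p.dose_fraction, reverse=True)
```

Feedback and learning are deliberately omitted here; in a full loop, post-application observations would be fed back into `run_cycle` on the next iteration.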
3. Proof-of-Concept Evaluation
The evaluation illustrates the feasibility and coordination capabilities of the proposed agentic framework, highlighting how individual components are integrated within the overall workflow. The objective of this section is to provide a proof-of-concept validation of the proposed agentic AI-based IoT Precision Agriculture framework using publicly available data sources. Rather than benchmarking individual machine learning models against state-of-the-art performance, the experiments are designed to demonstrate the feasibility, modularity, and interoperability of autonomous agents operating in a closed perception–decision–action loop. The experimental study illustrates how heterogeneous data modalities, including robotic imagery and environmental sensor data, can be jointly processed and reasoned upon to support autonomous agricultural decision-making.
3.1. Experimental Objectives
The experimental evaluation is guided by the following objectives:
- To demonstrate the integration of multimodal sensing data, including visual information and environmental measurements, within an agentic AI framework.
- To validate the ability of autonomous agents to reason over dynamic environmental conditions and visual observations.
- To illustrate closed-loop decision-making and action selection in representative precision agriculture scenarios.
- To assess the flexibility of the proposed architecture in supporting different sensing and actuation configurations.
3.2. Public Datasets and Data Sources
To enable reproducibility and avoid reliance on proprietary data, the experimental demonstration utilizes two publicly available datasets that capture complementary modalities relevant to precision agriculture: environmental IoT data and annotated robotic imagery depicting plant health status. These datasets serve as proxies for real-world sensing streams that an agentic AI-based IoT precision agriculture system would process and reason upon during crop monitoring and pest management, thereby instantiating the monitoring and perception components of the agentic layer introduced in Section 2.
The first dataset is the Environmental Data for Powdery Mildew and Downy Mildew [58]. This dataset consists of environmental sensor recordings spanning from 2020 to 2024 captured from a vineyard IoT deployment. It includes timestamped environmental measurements of air temperature, relative humidity, and precipitation, aggregated daily. For each day, minimum, maximum, and mean values of temperature and humidity are provided, capturing both average microclimatic conditions and short-term extremes. In addition, the dataset includes binary labels indicating the presence or absence of powdery and downy mildew events, serving as ground-truth annotations for disease occurrence. These daily summaries provide a comprehensive overview of vineyard microclimatic conditions across multiple growing seasons while preserving information critical for disease risk assessment. The data were collected under the AgriDataValue Horizon Europe research project, which emphasizes open agricultural data integration and analytics frameworks. This dataset is particularly suitable for our demonstration because it provides real-world environmental conditions that directly influence crop health, pest development, and disease pressure. By incorporating temperature and humidity streams, the dataset enables evaluation of environmental reasoning agents within the proposed framework. In the context of the agentic architecture, these data instantiate the environmental monitoring layer and support downstream reasoning and decision components. Furthermore, the disease labels enable visual correlation between sensor conditions and field outcomes, supporting the analysis of the agents' decision-making behavior. Figure 3 illustrates representative temperature, humidity, and precipitation patterns used as environmental inputs for autonomous agent reasoning.
Figure 3.
Environmental sensor data from the publicly available vineyard dataset, showing temperature, relative humidity, and precipitation measurements over time, together with a timeline of observed downy and powdery mildew events. Disease occurrence is visualized using horizontal event bars for improved interpretability. These data streams serve as proxy IoT inputs for evaluating environmental reasoning and decision-making agents within the proposed framework.
The second dataset utilized is the Downy Mildew and Powdery Mildew Symptoms collection hosted on Mendeley Data [59,60]. This dataset contains a set of annotated field images organized into two yearly subsets (2021 and 2022), with annotated bounding boxes for instances of powdery and downy mildew symptoms on grapevine leaves. The 2021 subset includes 753 images with 1696 powdery mildew and 657 downy mildew annotations. The 2022 subset expands the dataset with 1404 field images organized across 13 acquisition sessions, providing broader temporal and environmental coverage compared to the previous year. In total, the 2022 subset contains 3915 powdery mildew annotations and 20,959 downy mildew annotations, reflecting increased disease prevalence and visual diversity. This expanded annotation volume enhances the robustness of visual perception experiments by capturing a wider range of symptom manifestations, growth stages, and field conditions.
The annotations are provided in a standard bounding box format suitable for object detection and semantic analysis tasks. This dataset provides rich visual information that complements the environmental IoT stream. Within the context of the experimental validation, it enables the demonstration of robotic or drone vision agents tasked with identifying visible crop stress and disease signs. These visual inputs correspond to the perception layer of the agentic framework, transforming raw imagery into structured semantic evidence used by higher-level reasoning agents. These visual cues are essential for assessing the integration of perception-based reasoning with environmental data and for illustrating the decision-making capabilities of the agentic AI system under distinct multimodal inputs. Figure 4 presents sample annotated images used to demonstrate perception-driven reasoning and disease localization by vision-based agents.
Figure 4.
Representative annotated images from the downy mildew and powdery mildew symptoms dataset. Bounding boxes indicate visible disease symptoms on grapevine leaves, serving as visual inputs for perception agents operating within the agentic AI framework.
Due to the structure of Dataset 2022, in which images are organized into a limited number of acquisition folders and frequently capture the same leaves from slightly different viewpoints, a strict spatial or varietal separation of the training, validation, and test sets was not feasible. To mitigate the risk of visual overlap and overly optimistic performance estimates, Dataset 2022 was partitioned within each acquisition folder into training, validation, and test subsets based on file creation time. This temporally ordered splitting strategy reduces the likelihood of near-duplicate samples appearing across subsets and follows the evaluation protocol defined by the dataset authors [60].
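A per-folder, temporally ordered split of this kind can be sketched as follows. This is a minimal illustration rather than the exact procedure used by the dataset authors: the function name and the input layout are hypothetical, and the 50/25/25 fractions are an assumption chosen only to approximate the subset sizes reported below.

```python
from typing import Dict, List, Tuple

# (filename, creation timestamp) pairs grouped by acquisition folder.
Folder = List[Tuple[str, float]]


def temporal_split(folders: Dict[str, Folder],
                   train_frac: float = 0.5,
                   val_frac: float = 0.25) -> Dict[str, List[str]]:
    """Split each folder chronologically into train/val/test blocks."""
    split: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    for _, files in sorted(folders.items()):
        # Order images by creation time so each subset is a contiguous block.
        ordered = [name for name, _ in sorted(files, key=lambda f: f[1])]
        n_train = int(len(ordered) * train_frac)
        n_val = int(len(ordered) * val_frac)
        split["train"] += ordered[:n_train]
        split["val"] += ordered[n_train:n_train + n_val]
        split["test"] += ordered[n_train + n_val:]
    return split
```

Because consecutive shots of the same leaf tend to be adjacent in time, keeping each subset contiguous limits near-duplicates to the two block boundaries within a folder.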
The resulting partition consists of 710 training images, 347 validation images, and 347 test images, hereafter referred to as Train 2022, Val 2022, and Test 2022, respectively. Using these subsets, two experimental configurations were defined. In the first experiment, models are trained on Train 2022, validated on Val 2022, and evaluated on Test 2022, representing a standard within-season evaluation. In the second experiment, the same models trained on Train 2022 are evaluated on the complete Dataset 2021 without any additional fine-tuning, enabling an assessment of cross-season generalization and model robustness under different environmental and acquisition conditions. By restricting training exclusively to Dataset 2022 and reserving Dataset 2021 solely for testing, the experimental design avoids data leakage across seasons and provides a realistic evaluation of deployment scenarios in which models trained on recent data are applied to historical or unseen field conditions.
The two datasets together represent key sensing modalities that a field-deployed precision agriculture system must integrate: climatic and environmental variables and visual indicators of biological stress. Jointly, they operationalize the monitoring and perception layers of the proposed agentic architecture, enabling downstream reasoning and coordination to be evaluated in a proof-of-concept setting. By choosing these datasets, we demonstrate that the proposed framework can operate on realistic data that mimics sensor streams commonly encountered in agricultural contexts, even in the absence of proprietary vineyards or robotic platforms.
Moreover, the environmental dataset serves as a proxy for continuous IoT streams (e.g., temperature, humidity), enabling evaluation of environmental reasoning agents, while the image dataset provides annotated field-level visual cues for perception agents. Their integration supports the multimodal evidence fusion step defined in the coordination layer of Section 2. The use of open datasets with community-supported standards ensures that our experiments are reproducible and comparable for future research extensions.
3.3. Experimental Setup and Agent Workflow
The overall workflow is explicitly outlined to provide a clear and reproducible description of the main processing steps and their integration within the proposed framework. The experimental setup follows the architectural principles described in Section 2 and instantiates a set of autonomous agents operating over complementary sensing modalities. Visual perception agents process image-based observations acquired from robotic or UAV platforms, while environmental reasoning agents analyze time-series data originating from IoT sensors. The objective of this setup is to demonstrate how heterogeneous machine learning models can be embedded within an agentic framework and coordinated in a closed perception–decision–action loop, consistent with the perception, reasoning, and decision layers defined in the proposed agentic architecture (Section 2), with explicit consideration of energy efficiency and deployment constraints.
3.3.1. Visual Perception and Object Detection Using YOLO26
Visual perception within the proposed framework is implemented using the YOLO26 family of object detection models [61], which are integrated into dedicated perception agents responsible for detecting and localizing visible crop diseases and stress indicators in field imagery. YOLO26 follows a single-stage detection paradigm, enabling end-to-end object localization and classification within a unified neural architecture. Within the agentic layer described in Section 2, these perception agents correspond to the monitoring and analysis components that transform raw sensory input into structured field-state representations.
A key advantage of YOLO26 is its emphasis on real-time inference and scalability across a wide range of computational and energy budgets. In the experimental setup, four YOLO26 variants are employed: YOLO26n (nano), YOLO26s (small), YOLO26m (medium), and YOLO26l (large). These variants differ in network depth, width, and parameter count, enabling systematic exploration of the trade-off between detection robustness, inference latency, and energy consumption. This trade-off is particularly relevant in precision agriculture, where perception models are often deployed on energy-constrained edge platforms such as UAVs or mobile robots.
Within the agentic architecture, YOLO26-based perception agents function as semantic sensors that convert raw visual data into structured observations, including disease presence indicators, spatial localization information, and confidence scores. Importantly, the selection of a specific YOLO26 variant is treated as an agent-level decision rather than a static design choice. This adaptive behavior reflects the coordination logic of the agentic decision layer introduced in Section 2, where agents dynamically adjust sensing and inference intensity based on operational context. For example, lightweight configurations such as YOLO26n can be employed for continuous low-power monitoring, while higher-capacity variants such as YOLO26l can be selectively activated when increased detection reliability is required. This adaptive model selection mechanism directly supports the principle of net-positive AI energy by minimizing unnecessary computational and energy expenditure while preserving operational effectiveness.
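One way such an agent-level selection rule might look is sketched below. The function name, the use of a normalized risk score and battery level as inputs, and all thresholds are assumptions made for illustration; the framework itself does not prescribe a specific policy.

```python
def select_variant(risk: float, battery: float) -> str:
    """Pick a YOLO26 variant name from disease risk and remaining battery.

    Both inputs are assumed to lie in [0, 1]; thresholds are illustrative.
    """
    if battery < 0.2:
        return "YOLO26n"   # preserve energy regardless of risk
    if risk >= 0.7:
        return "YOLO26l"   # highest capacity when risk is elevated
    if risk >= 0.4:
        return "YOLO26m" if battery >= 0.5 else "YOLO26s"
    return "YOLO26n"       # default: continuous low-power monitoring
```

For example, `select_variant(0.9, 0.9)` returns `"YOLO26l"`, whereas the same risk with a nearly depleted battery (`select_variant(0.9, 0.1)`) falls back to `"YOLO26n"`.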
Detection performance is evaluated using standard object detection metrics. Specifically, mean Average Precision at an Intersection-over-Union threshold of 0.5 (mAP50) and mean Average Precision averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05 (mAP50-95) are reported. While mAP50 captures detection capability under a relaxed overlap criterion, mAP50-95 provides a more stringent and comprehensive assessment of localization accuracy. In addition to accuracy-based metrics, inference speed (measured in milliseconds per image), number of model parameters (in millions), and computational complexity expressed in floating-point operations (FLOPs, in billions) are reported. These metrics collectively characterize the accuracy–efficiency–energy trade-offs of different YOLO26 variants, which are central to agent-level model selection in the proposed framework.
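To make the overlap criterion concrete, the following sketch computes the IoU of two axis-aligned boxes and lists the thresholds at which a prediction would count as a true positive. It covers only the matching criterion, not the full mAP computation (which also involves ranking by confidence and integrating the precision-recall curve); the function names and the corner-based box format are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95.
THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which this prediction counts as a true positive."""
    overlap = iou(pred, truth)
    return [t for t in THRESHOLDS if overlap >= t]
```

A prediction with an IoU of 0.78 therefore counts as correct for mAP50 but only at six of the ten thresholds used by mAP50-95, which is why the latter is the stricter measure of localization quality.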
All YOLO26 models are trained using a consistent configuration to ensure a fair and reproducible evaluation across different model variants. Specifically, training is performed for 100 epochs using an input image resolution of pixels and a batch size of 4, constrained by GPU memory limitations. The models are initialized from pretrained weights and trained on a custom dataset defined via a standard YOLO-format configuration file. Optimization is carried out using the default Ultralytics training pipeline, which employs stochastic gradient-based optimization with adaptive learning rate scheduling. Data preprocessing and augmentation follow the standard YOLO training procedures, including geometric and photometric transformations such as scaling, flipping, and color adjustments, which improve generalization under varying field conditions. All experiments are conducted on a single GPU device, and no additional hyperparameter tuning is performed beyond the default settings, ensuring consistency across model comparisons. Validation performance on Val 2022 is monitored throughout training, and the model weights corresponding to the best validation performance are selected for final evaluation.
By standardizing the training configuration and reporting both performance and computational metrics, the experimental setup enables a transparent and reproducible analysis of how different YOLO26 variants balance detection accuracy, inference speed, and energy efficiency within an agentic precision agriculture context.
3.3.2. Environmental Sensing Data Reasoning Using TabPFN
Environmental sensing data, including temperature, relative humidity, and precipitation statistics, are processed by environmental reasoning agents implemented using TabPFN [62,63]. TabPFN is a probabilistic foundation model designed for tabular data, offering strong predictive capabilities without the need for task-specific retraining or extensive feature engineering. These agents instantiate the environmental reasoning component of the agentic layer defined in Section 2, contributing temporal and contextual information to the shared field-state representation.
From an energy-efficiency perspective, TabPFN is well-suited for agentic precision agriculture systems. Its ability to perform inference in few-shot or zero-shot settings significantly reduces the computational cost associated with model training and retraining, which is a non-trivial component of AI energy consumption. This characteristic aligns with the net-positive AI energy objective by reducing lifecycle energy costs while maintaining adaptive reasoning capabilities.
In the experimental workflow, TabPFN-based agents operate on aggregated environmental features, such as daily minimum, maximum, and mean temperature and humidity values, as well as precipitation measurements. These features capture both average conditions and extreme environmental events that influence fungal disease development. The probabilistic outputs generated by TabPFN provide uncertainty-aware assessments that can be leveraged by decision-making agents to balance intervention urgency against energy and resource usage.
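The public dataset already provides these daily aggregates. For a live IoT stream, an equivalent reduction could look like the sketch below, which assumes readings arrive as tuples with an ISO-formatted timestamp; the function name and feature keys are hypothetical, and precipitation is omitted for brevity.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# One raw reading: (ISO timestamp, temperature in °C, relative humidity in %).
Reading = Tuple[str, float, float]


def daily_features(readings: List[Reading]) -> Dict[str, Dict[str, float]]:
    """Aggregate raw readings into per-day min/max/mean features."""
    by_day = defaultdict(list)
    for timestamp, temp, hum in readings:
        by_day[timestamp[:10]].append((temp, hum))  # key on YYYY-MM-DD
    features = {}
    for day, values in sorted(by_day.items()):
        temps = [t for t, _ in values]
        hums = [h for _, h in values]
        features[day] = {
            "temp_min": min(temps), "temp_max": max(temps), "temp_mean": mean(temps),
            "hum_min": min(hums), "hum_max": max(hums), "hum_mean": mean(hums),
        }
    return features
```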
For model evaluation, the dataset is randomly split into training and test subsets using an 80/20 ratio. To address class imbalance present in the training data, oversampling is applied exclusively to the training split using the Synthetic Minority Over-sampling Technique (SMOTE) [64]. This procedure generates synthetic samples for minority classes, resulting in a more balanced training distribution while preserving the original class proportions in the held-out test set.
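The order of operations matters here: the split is performed first, and synthetic samples are generated from the training portion only. The sketch below illustrates this with a simplified, dependency-free variant of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours). It is not the reference SMOTE implementation, and the function names, the binary-label assumption, and the default `k` are illustrative.

```python
import random
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, binary label)


def split_80_20(data: List[Sample], seed: int = 0):
    """Random 80/20 train/test split."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def smote_like(train: List[Sample], k: int = 3, seed: int = 0) -> List[Sample]:
    """Balance the training set by interpolating between minority neighbours."""
    rng = random.Random(seed)
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    minority, label = (pos, 1) if len(pos) < len(neg) else (neg, 0)
    needed = abs(len(pos) - len(neg))
    if len(minority) < 2:
        return train[:]  # nothing to interpolate between
    synthetic = []
    for _ in range(needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance.
        others = sorted((m for m in minority if m is not base),
                        key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        point = [a + gap * (b - a) for a, b in zip(base, neighbour)]
        synthetic.append((point, label))
    return train + synthetic
```

Calling `smote_like` only on the first element returned by `split_80_20` leaves the test subset untouched, so the reported metrics are computed on real samples with the original class proportions.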
Model performance is evaluated using both global and class-specific metrics to provide a comprehensive assessment of predictive behavior under class-imbalanced conditions. At the global level, Accuracy and the Receiver Operating Characteristic Area Under the Curve (ROC AUC) are reported. Accuracy provides an overall measure of correct classifications, while ROC AUC captures the model’s ability to discriminate between classes across varying decision thresholds, offering a threshold-independent evaluation of ranking quality. To better understand class-wise behavior, Precision, Recall, and the F1-score are computed for both classes individually. Precision quantifies the reliability of positive predictions, Recall measures the ability to detect true positive cases, and the F1 score provides a harmonic mean of Precision and Recall, balancing detection performance and false alarm rates. These evaluation metrics are consistent with those employed in the original study from which the dataset is derived [58], ensuring methodological comparability.
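For reference, the class-wise metrics and ROC AUC described above can be computed as in the following minimal sketch. The function names are our own; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half), which assumes that both classes are present.

```python
from typing import List, Tuple


def precision_recall_f1(y_true: List[int], y_pred: List[int],
                        positive: int = 1) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Requires at least one positive and one negative sample.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Calling `precision_recall_f1` once with `positive=1` and once with `positive=0` yields the per-class values; their unweighted mean gives the macro-averaged score.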
3.3.3. Multimodal Agent Coordination, Decision Flow, and Energy Awareness
The outputs of YOLO26-based visual perception agents and TabPFN-based environmental reasoning agents are integrated through a multimodal agent coordination layer. This integration corresponds to the decision and coordination layer of the agentic framework (Section 2), where heterogeneous evidence streams are fused prior to action selection. Visual detections provide spatial and semantic context, while environmental agents contribute temporal and climatic context, influencing disease development and progression.
Decision-making agents fuse these heterogeneous inputs to determine appropriate actions, such as prioritizing specific field regions for closer inspection, adjusting monitoring frequency, or planning targeted pesticide application via UAVs. In alignment with the reference operational workflow defined earlier, these decisions simulate the transition from perception to planning within the closed-loop agentic control cycle. Crucially, these decisions are made with explicit awareness of energy and resource constraints. For instance, agents may reduce perception frequency or switch to lower-complexity YOLO26 variants during periods of low risk, thereby conserving energy, or escalate sensing and analysis only when environmental and visual indicators jointly suggest elevated disease risk.
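A simple form of this fusion step is sketched below as a weighted combination of the strongest visual detection confidence in a region and the environmental disease probability, followed by a threshold-based action choice. The weighting scheme, the thresholds, and the action labels are illustrative assumptions; the framework leaves the concrete fusion rule open.

```python
from typing import List


def fuse_risk(detection_confidences: List[float], env_probability: float,
              w_visual: float = 0.6) -> float:
    """Combine visual and environmental evidence into one risk score in [0, 1]."""
    # Strongest detection in the region; 0.0 when nothing was detected.
    visual = max(detection_confidences, default=0.0)
    return w_visual * visual + (1.0 - w_visual) * env_probability


def choose_action(risk: float) -> str:
    """Map fused risk to one of three illustrative actions."""
    if risk >= 0.7:
        return "plan_targeted_spraying"
    if risk >= 0.4:
        return "inspect_region"
    return "reduce_monitoring_frequency"
```

With these assumed weights, a region with no detections and a low environmental probability (`fuse_risk([], 0.2)`) leads to reduced monitoring, whereas strong evidence in both modalities (`fuse_risk([0.9, 0.5], 0.8)`) escalates to spray planning.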
Overall, this experimental setup demonstrates how a single object detection family with multiple capacity variants (YOLO26n, YOLO26s, YOLO26m, and YOLO26l), combined with an energy-efficient tabular foundation model, can be orchestrated within an agentic AI framework. The coordination behavior mirrors the layered agentic architecture proposed in Section 2, albeit instantiated here as a proof-of-concept without physical actuation. The emphasis is placed on autonomy, adaptability, and net-positive AI energy principles, rather than on static, performance-driven model deployment, reinforcing the sustainability objectives of the proposed precision agriculture system.
3.4. Results and Observations
The reported results illustrate the behavior and integration of the proposed framework within the selected datasets. A comprehensive validation, including cross-dataset evaluation, robustness analysis, and component-level studies, is beyond the scope of this work and is considered part of future research. The experimental scenarios demonstrate that the proposed agentic framework can adapt decisions based on variations in environmental conditions and visual inputs. Changes in temperature and humidity values influence agent assessments of crop stress and risk levels, while visual cues contribute to localized decision-making related to monitoring and intervention. Although the results are qualitative in nature, they highlight the advantages of distributed agent reasoning and closed-loop autonomy compared to static, rule-based precision agriculture pipelines.
3.4.1. Visual Detection Results and Cross-Season Generalization
This section presents the experimental results obtained from the visual data using the YOLO26 family of object detection models. All models were trained exclusively on Dataset 2022 and evaluated under two conditions: (i) in-season testing on Test 2022 and (ii) cross-season evaluation on the complete Dataset 2021. The reported metrics include detection accuracy, inference speed, and model complexity, enabling a comprehensive assessment of accuracy–efficiency trade-offs relevant to agentic and energy-aware deployment.
Table 3 reports detection performance on Test 2022. Consistent improvements in detection accuracy are observed as model capacity increases from YOLO26n to YOLO26l. Specifically, mAP50 improves from 72.0 to 77.2, while mAP50-95 increases from 47.1 to 51.9. These gains come at the cost of increased inference latency and computational complexity, with YOLO26n achieving the lowest inference time (5.5 ms) and YOLO26l exhibiting the highest parameter count and FLOPs. This trend highlights the expected accuracy–efficiency trade-off and confirms that lightweight variants can provide competitive performance under strict latency and energy constraints.
Table 3.
Detection performance of YOLO26 variants evaluated on Test 2022. The table reports detection accuracy using standard metrics (mAP50 and mAP50-95), along with inference speed (milliseconds per image), model size (parameters in millions), and computational complexity (FLOPs in billions) across different model scales.
Table 4 summarizes cross-season generalization results on Dataset 2021. As expected, all models experience a noticeable drop in detection performance compared to in-season evaluation, reflecting changes in environmental conditions, acquisition settings, and disease appearance across seasons. Nevertheless, higher-capacity models demonstrate improved robustness under domain shift. YOLO26m and YOLO26l achieve mAP50-95 values of 56.3 and 56.9, respectively, substantially outperforming YOLO26n and YOLO26s in this setting. This suggests that increased model capacity enhances generalization to unseen seasonal conditions, albeit with higher computational and energy costs.
Table 4.
Cross-season generalization performance of YOLO26 variants trained on Dataset 2022 and evaluated on Dataset 2021. The table reports detection accuracy (mAP50-95 and mAP50), inference speed (milliseconds per image), model size in terms of parameters (millions), and computational complexity (FLOPs in billions) for different model scales.
From an agentic perspective, these results support adaptive model selection strategies. Lightweight models such as YOLO26n are well-suited for continuous, low-latency monitoring, while larger variants can be selectively deployed when higher detection robustness or cross-season reliability is required. This flexible use of model variants aligns with the net-positive AI energy principle by enabling agents to balance detection performance against inference cost and energy consumption depending on operational context.
3.4.2. Results from the Sensing Data
This section presents the results of the experiments on the sensing data, which are summarized in Table 5. Performance is reported for downy mildew and powdery mildew prediction using accuracy, ROC AUC, and class-wise precision, recall, and F1-score.
Table 5. TabPFN classification performance for grapevine disease prediction.
For downy mildew, the model achieves an overall accuracy of 0.9649 and a ROC AUC of 0.9436, indicating strong discriminative performance. Weighted precision, recall, and F1-score all exceed 0.96, while the macro F1-score reaches 0.9235, reflecting the increased difficulty of disease-positive prediction. Class-wise analysis shows very strong performance for the no-disease class (F1-score of 0.9798), whereas the disease class attains a recall of 0.8180 and an F1-score of 0.8672. This imbalance is expected in early disease risk assessment, where disease conditions are rarer and less sharply separable based on environmental factors alone.
Similar trends are observed for powdery mildew prediction. The model achieves an accuracy of 0.9643 and a ROC AUC of 0.9536, with a macro F1-score of 0.9217. Disease-class recall reaches 0.8206, indicating consistent sensitivity across both disease targets. The close alignment of results for downy and powdery mildew suggests that the learned decision boundaries generalize well across related disease processes influenced by similar climatic conditions.
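The gap between weighted and macro scores follows directly from how macro averaging treats the two classes equally. As a short worked check using only the downy mildew class-wise F1-scores quoted above (the function name `macro_f1` is illustrative), the unweighted mean reproduces the reported macro F1-score:

```python
def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    return sum(per_class_f1) / len(per_class_f1)


# Class-wise F1-scores reported for downy mildew.
f1_no_disease = 0.9798
f1_disease = 0.8672

downy_macro = macro_f1([f1_no_disease, f1_disease])
print(round(downy_macro, 4))  # 0.9235
```

Because the disease class is rarer, the weighted averages are dominated by the no-disease class, whereas the macro average exposes the weaker disease-class performance.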
Overall, these results demonstrate that transformer-based models designed for tabular data can effectively exploit IoT-derived environmental features for early grapevine disease risk prediction. The strong in-distribution performance supports deployment in continuous monitoring and decision-support scenarios, where models are periodically retrained under stable data distributions. However, as evaluation is based on a random split of augmented data, the reported metrics primarily reflect in-distribution generalization; assessing robustness under temporal or cross-season shifts remains an important direction for future work.
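A cross-season evaluation of the kind suggested above could be set up by holding out an entire season instead of splitting at random. The sketch below is an assumption-laden illustration: the `season` field, the record layout, and the sample values are hypothetical and are not taken from the dataset used in this study.

```python
def split_by_season(records, test_season):
    """Hold out every record from one season as the test set."""
    train = [r for r in records if r["season"] != test_season]
    test = [r for r in records if r["season"] == test_season]
    return train, test


# Hypothetical records; field names and values are illustrative only.
records = [
    {"season": 2021, "temperature": 18.5, "humidity": 82.0, "label": 1},
    {"season": 2021, "temperature": 24.1, "humidity": 55.0, "label": 0},
    {"season": 2022, "temperature": 19.2, "humidity": 79.0, "label": 1},
    {"season": 2022, "temperature": 26.7, "humidity": 48.0, "label": 0},
    {"season": 2022, "temperature": 22.3, "humidity": 60.0, "label": 0},
]

train, test = split_by_season(records, test_season=2021)
print(len(train), len(test))  # 3 2

# Any augmentation or oversampling would then be applied to `train` only,
# so that synthetic samples derived from a season never reach its test set.
```

Splitting before augmentation keeps the held-out season free of samples derived from training data, which a random split of already augmented data cannot guarantee.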
In the current setup, perception and reasoning components are evaluated individually, while their coordination is defined at the architectural level. A fully integrated closed-loop evaluation of the multi-agent system is left for future work.
3.5. Discussion and Limitations
The current evaluation is limited to vineyard disease datasets and should be interpreted as a use-case-specific proof of concept. The broader applicability of the proposed framework remains conceptual and will be validated in future work across diverse agricultural scenarios.
The presented experiments rely on publicly available datasets and do not constitute a fully synchronized real-world deployment. As a result, the evaluation should be interpreted as a proof of concept rather than a performance-oriented assessment. The current setup does not incorporate real-time sensor streams, closed-loop actuation, or long-term system operation, which are essential components for validating autonomous agricultural systems under operational conditions.
Several important directions for future work emerge from these limitations. First, long-term field deployments using real sensor networks are required to assess system stability, data reliability, and performance over extended periods and under varying environmental conditions. Such deployments would enable evaluation of temporal consistency, data drift, and seasonal variability, which are critical factors in agricultural applications.
Second, the integration of the proposed framework with autonomous robotic platforms, including UAVs and ground-based systems, represents a key step toward realizing closed-loop perception–decision–action workflows. This would allow validation of real-time coordination, task execution, and system responsiveness in practical scenarios.
Third, the development of uncertainty-aware decision policies remains an important research direction. While the current framework incorporates belief representations and confidence-based reasoning, future work should explore more explicit mechanisms for uncertainty quantification and propagation, enabling more robust and risk-aware decision-making, consistent with the belief-based formulation introduced in Section 2.2.
Finally, robustness under environmental variability, including changes in illumination, weather conditions, crop growth stages, and seasonal dynamics, requires further investigation. Evaluating the framework across diverse conditions and geographic regions will be essential to ensure generalization and scalability.
Overall, these directions highlight the need for comprehensive real-world validation and continued development of the proposed framework, bridging the gap between conceptual design and operational deployment.
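To make the uncertainty-aware direction above more concrete, the following sketch shows one generic way a belief over a binary disease state could be updated from successive detector outputs using Bayes' rule. It is an illustration under stated assumptions, not the formulation of Section 2.2: the function name, the prior, and the sensitivity and specificity values are all hypothetical.

```python
def update_belief(prior, detected, sensitivity, specificity):
    """Posterior P(disease | observation) for a binary detector outcome."""
    if detected:
        like_disease, like_healthy = sensitivity, 1.0 - specificity
    else:
        like_disease, like_healthy = 1.0 - sensitivity, specificity
    evidence = like_disease * prior + like_healthy * (1.0 - prior)
    return like_disease * prior / evidence


# Illustrative values only: a 10% prior and an imperfect detector.
belief = 0.10
for observation in (True, True):  # two consecutive positive detections
    belief = update_belief(belief, observation,
                           sensitivity=0.8, specificity=0.9)

# A risk-aware policy could escalate only when the belief is high enough.
ACTION_THRESHOLD = 0.8  # hypothetical threshold
action = "inspect_closely" if belief >= ACTION_THRESHOLD else "keep_monitoring"
print(action)  # inspect_closely
```

A single positive detection raises the belief to roughly 0.47 under these assumed values, which stays below the threshold; only the second consistent observation triggers escalation, which is the kind of behavior that explicit uncertainty propagation is meant to provide.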
4. Conclusions
This paper articulated a vision and reference architecture for agentic AI-based IoT precision agriculture, emphasizing a shift from recommendation-centric analytics to closed-loop autonomy spanning perception, deliberation, coordination, and actuation. The proposed framework models agricultural operations as a distributed set of goal-driven agents that continuously maintain field state under uncertainty, negotiate resource constraints, and trigger context-aware interventions, with pesticide management serving as a representative and high-impact use-case.
To ground the conceptual framework, we provided a proof-of-concept experimental demonstration using open datasets that approximate two core sensing modalities in real deployments: field imagery and environmental IoT measurements. Vision-based perception agents built on YOLO26 localized disease symptoms from grapevine images and exposed an explicit accuracy–latency trade-off across model variants, supporting adaptive model selection under energy constraints. In parallel, TabPFN-based environmental reasoning agents inferred disease risk from temperature, humidity, and precipitation features while avoiding costly task-specific retraining. The combined workflow illustrates how multimodal evidence can be fused by decision-making agents to adapt sensing intensity and computational effort to operational conditions, aligning with net-positive AI principles in which resource and chemical savings can outweigh the energy cost of AI inference.
Several limitations remain. Our evaluation does not constitute a fully synchronized field deployment with real-time streaming, actuation feedback, and safety-certified intervention policies. Cross-season generalization results highlight the practical importance of robustness under domain shift, motivating continual monitoring, calibration, and data governance mechanisms. Furthermore, reliable autonomy requires explicit treatment of connectivity constraints, interoperability across heterogeneous vendors, and formal safety/compliance gates for pesticide application.
Future work will focus on end-to-end field validation in vineyard environments, including: (i) online/continual learning under seasonal drift, (ii) explicit uncertainty-aware planning and human override interfaces, (iii) edge deployment on UAV/robot hardware with measured energy budgets, and (iv) traceability mechanisms that log evidence, decisions, and actions for audit and regulation. Overall, the presented framework provides a scalable foundation for sustainable, autonomous crop protection and broader precision agriculture workflows.
Author Contributions
Conceptualization, D.D., S.K. and K.M.; methodology, D.D. and I.D.; software, I.D. and I.K.; validation, I.D., I.K. and K.M.; formal analysis, D.D. and S.K.; investigation, D.D., S.K. and I.D.; resources, I.D.; data curation, I.D.; writing—original draft preparation, D.D., S.K., I.D. and I.K.; writing—review and editing, I.D.; visualization, S.K., I.D. and I.K.; supervision, D.D. and K.M.; project administration, D.D.; funding acquisition, S.K., I.D., I.K. and K.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data used in this study are publicly available from the sources cited in References [58,59,60]. The proof-of-concept evaluation relies on two openly available datasets: the Environmental Data for Powdery Mildew and Downy Mildew dataset and the Downy Mildew and Powdery Mildew Symptoms collection hosted on Mendeley Data. No new proprietary data were created; the study uses public datasets to support reproducibility and comparability of future research.
Conflicts of Interest
The authors declare no conflicts of interest.
References
1. Adve, V.; Brown, S.; Fern, A.; Ganapathysubramanian, B.; Kalyanaraman, A.; Shekhar, S.; Tagkopoulos, I.; Wedow, J. Advancing AI in Agriculture through Large-Scale Collaborative Research. Commun. ACM 2025, 68, 78–86.
2. Wolfert, S.; Ge, L.; Verdouw, C.; Bogaardt, M.J. Big Data in Smart Farming—A review. Agric. Syst. 2017, 153, 69–80.
3. Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674.
4. Rose, D.C.; Chilvers, J. Agriculture 4.0: Broadening Responsible Innovation in an Era of Smart Farming. Front. Sustain. Food Syst. 2018, 2, 387545.
5. da Silveira, F.; Lermen, F.H.; Amaral, F.G. An overview of agriculture 4.0 development: Systematic review of descriptions, technologies, barriers, advantages, and disadvantages. Comput. Electron. Agric. 2021, 189, 106405.
6. Abbasi, R.; Martinez, P.; Ahmad, R. The digitization of agricultural industry—A systematic literature review on agriculture 4.0. Smart Agric. Technol. 2022, 2, 100042.
7. Choudhary, V.; Guha, P.; Pau, G.; Mishra, S. An overview of smart agriculture using internet of things (IoT) and web services. Environ. Sustain. Indic. 2025, 26, 100607.
8. Miller, T.; Mikiciuk, G.; Durlik, I.; Mikiciuk, M.; Łobodzińska, A.; Śnieg, M. The IoT and AI in agriculture: The time is now—A systematic review of smart sensing technologies. Sensors 2025, 25, 3583.
9. Kamilaris, A.; Prenafeta Boldú, F. Deep Learning in Agriculture: A Survey. Comput. Electron. Agric. 2018, 147, 70–90.
10. Zhu, H.; Qin, S.; Su, M.; Lin, C.; Li, A.; Gao, J. Harnessing large vision and language models in agriculture: A review. Front. Plant Sci. 2025, 16, 1579355.
11. Lytridis, C.; Kaburlasos, V.G.; Pachidis, T.; Manios, M.; Vrochidou, E.; Kalampokas, T.; Chatzistamatis, S. An overview of cooperative robotics in agriculture. Agronomy 2021, 11, 1818.
12. An, W.; Wu, D.; Ci, S.; Luo, H.; Adamchuk, V.; Xu, Z. Agriculture cyber-physical systems. In Cyber-Physical Systems; Elsevier: Amsterdam, The Netherlands, 2017; pp. 399–417.
13. Weraikat, D.; Šorič, K.; Žagar, M.; Sokač, M. Data analytics in agriculture: Enhancing decision-making for crop yield optimization and sustainable practices. Sustainability 2024, 16, 7331.
14. Kassam, A.; Derpsch, R.; Friedrich, T. Global achievements in soil and water conservation: The case of Conservation Agriculture. Int. Soil Water Conserv. Res. 2014, 2, 5–13.
15. Clark, M.; Tilman, D. Comparative analysis of environmental impacts of agricultural production systems, agricultural input efficiency, and food choice. Environ. Res. Lett. 2017, 12, 064016.
16. Elijah, O.; Rahman, T.A.; Orikumhi, I.; Leow, C.Y.; Hindia, M.N. An Overview of Internet of Things (IoT) and Data Analytics in Agriculture: Benefits and Challenges. IEEE Internet Things J. 2018, 5, 3758–3773.
17. Patil, N.; Khairnar, D.V.D. IoT based smart farming system. Int. J. Adv. Res. Ideas Innov. Technol. 2021, 7, 1280–1285.
18. Spasev, V.; Dimitrovski, I.; Kitanovski, I.; Chorbev, I. Semantic segmentation of remote sensing images: Definition, methods, datasets and applications. In International Conference on ICT Innovations; Springer: Cham, Switzerland, 2023; pp. 127–140.
19. Gatkal, N.R.; Nalawade, S.M.; Bhanage, G.B.; Sahni, R.K.; Walunj, A.A.; Kadam, P.B.; Ali, M. Review of UAVs for efficient agrochemical spray application. Int. J. Agric. Biol. Eng. 2025, 18, 1–9.
20. Jagadeeswari, M.; Manikandababu, C.S.; S, K.; R, P.; S, M. Artificial Intelligence based Crop Recommendation System. In Proceedings of the 2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 21–23 September 2022; pp. 1127–1133.
21. Jakkani, A. Artificial Intelligence and its Applications in the Field of Internet of Things (Iot). Int. J. Res. Sci. Eng. 2024, 4, 49–61.
22. Lan, J.; Ban, Q. The Farm-Level Economic and Environmental Benefits of Precision Agriculture Technology Adoption: A Meta-Analysis of Global Evidence. Sustainability 2025, 17, 11223.
23. Atapattu, A.J.; Perera, L.K.; Nuwarapaksha, T.D.; Udumann, S.S.; Dissanayaka, N.S. Challenges in Achieving Artificial Intelligence in Agriculture. In Artificial Intelligence Techniques in Smart Agriculture; Springer Nature: Singapore, 2024; pp. 7–34.
24. Yu, P.; Teng, F.; Zhu, W.; Shen, C.; Chen, Z.; Song, J. Cloud–edge–device collaborative computing in smart agriculture: Architectures, applications, and future perspectives. Front. Plant Sci. 2025, 16, 1668545.
25. Agyeman, B.T.; Decardi-Nelson, B.; Liu, J.; Shah, S.L. A semi-centralized multi-agent RL framework for efficient irrigation scheduling. Control Eng. Pract. 2025, 155, 106183.
26. Srinivasu, P.N.; Pavate, A.; JayaLakshmi, G.; Shafi, J.; Choi, J.; Ijaz, M.F. Agentic AI for smart and sustainable precision agriculture. Front. Plant Sci. 2026, 16, 1706428.
27. Abou Ali, M.; Dornaika, F.; Charafeddine, J. Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions. Artif. Intell. Rev. 2025, 59, 11.
28. Toscano, F.; Fiorentino, C.; Capece, N.; Erra, U.; Travascia, D.; Scopa, A.; Drosos, M.; D’Antonio, P. Unmanned Aerial Vehicle for Precision Agriculture: A Review. IEEE Access 2024, 12, 69188–69205.
29. Ybañez, R.Z.; Ybañez, R.Z. Recent Advancements in Drone-Based Remote Sensing for Precision Agriculture: A Mini-Review of Applications, Challenges, and Opportunities. Asian J. Res. Agric. For. 2025, 11, 253–261.
30. Sreenatha, A.; Nagaraj, K.S.; Kumar, A.; Reddy, T.B.M.; Mushrif, S.K.; Narabenchi, G.; Krishna, H.C.; Dhananjaya, P. Applications of Unmanned Aerial Vehicles (UAVs) in Agriculture: A Review. Int. J. Res. Agron. 2025, 8, 292–298.
31. Anam, I.; Arafat, N.; Hafiz, M.S.; Jim, J.R.; Kabir, M.M.; Mridha, M. A systematic review of UAV and AI integration for targeted disease detection, weed management, and pest control in precision agriculture. Smart Agric. Technol. 2024, 9, 100647.
32. Safaeinejad, M.; Ghasemi-Nejad-Raeini, M.; Taki, M. Reducing energy and environmental footprint in agriculture: A study on drone spraying vs. conventional methods. PLoS ONE 2025, 20, e0323779.
33. Mansoor, S.; Iqbal, S.; Popescu, S.M.; Kim, S.L.; Chung, Y.S.; Baek, J.H. Integration of smart sensors and IOT in precision agriculture: Trends, challenges and future prospectives. Front. Plant Sci. 2025, 16, 1587869.
34. Agrawal, J.; Arafat, M.Y. Transforming Farming: A Review of AI-Powered UAV Technologies in Precision Agriculture. Drones 2024, 8, 664.
35. Xing, Y.; Liu, X.; Wang, X. Integrating UAVs, satellite remote sensing, and machine learning in precision agriculture: Pathways to sustainable food production, resource efficiency, and scalable innovation. Front. Agron. 2026, 7, 1670380.
36. Haghighat, M.; Saleh, A.; Azghadi, M.R. Multimodal language models in agriculture: A tutorial and survey. Inf. Fusion 2025, 129, 104042.
37. Sapkota, R.; Qureshi, R.; Hadi, M.U.; Hassan, S.Z.; Sadak, F.; Shoman, M.; Sajjad, M.; Dharejo, F.A.; Paudel, A.; Li, J.; et al. Multi-modal LLMs in agriculture: A comprehensive review. IEEE Trans. Autom. Sci. Eng. 2025, 22, 22510–22540.
38. Kuska, M.T.; Wahabzada, M.; Paulus, S. AI for crop production—Where can large language models (LLMs) provide substantial value? Comput. Electron. Agric. 2024, 221, 108924.
39. Yu, P.; Lin, B. A Framework for Agricultural Intelligent Analysis Based on a Visual Language Large Model. Appl. Sci. 2024, 14, 8350.
40. Cantonjos, N.A.P.; Biswas, A. AgroAskAI: A Multi-Agentic AI Framework for Supporting Smallholder Farmers’ Enquiries Globally. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2026; Volume 40, pp. 40217–40223.
41. Amjad, M.; Tehseen, R.; Nasr, K.; Aslam, F.; Awan, M.M.; Omer, U. Agentic AI for Autonomous Soil and Fertilization Management for Agriculture Sustainability. Int. J. Innov. Sci. Technol. 2025, 7, 2997–3017.
42. Swati, N.P.; Gupta, S.V.; Duddela, N.S.; Parvathy, L.R. Agentic AI-driven autonomous decision support system for smart agriculture. Sci. Rep. 2026, 16, 9972.
43. Murad, M.; Ahmed, M.; din, N.u.; Shahid, M.F.; Siddiqui, S.; Byers, D.; Tanveer, M.H.; Voicu, R.C. Agentic AI Framework to Automate Traditional Farming for Smart Agriculture. AgriEngineering 2026, 8, 8.
44. Toskov, B.; Toskova, A. AgroNova: An Autonomous IoT Platform for Greenhouse Climate Control. Sensors 2026, 26, 1861.
45. Pardo-Pina, S.; Germer, J.; Suay-Cortés, R.; Platero-Horcajadas, M.; Ferrández-Pastor, F.J. An Agent-Based Service Architecture for Smart Greenhouses: Telemetry Analytics and Decision Support with RAG-grounded LLM Agents. Smart Agric. Technol. 2026, 13, 101872.
46. Tariq, M.U.; Saqib, S.M.; Mazhar, T.; Khan, M.A.; Shahzad, T.; Hamam, H. Edge-enabled smart agriculture framework: Integrating IoT, lightweight deep learning, and agentic AI for context-aware farming. Results Eng. 2025, 28, 107342.
47. Tang, R.; Tang, J.; Talip, M.S.A.; Aridas, N.K.; Xu, X. Enhanced multi agent coordination algorithm for drone swarm patrolling in durian orchards. Sci. Rep. 2025, 15, 9139.
48. Ahmadi, M.; Singletary, A.; Burdick, J.W.; Ames, A.D. Safe policy synthesis in multi-agent POMDPs via discrete-time barrier functions. In Proceedings of the 2019 IEEE 58th Conference on Decision and Control (CDC), Nice, France, 11–13 December 2019; pp. 4797–4803.
49. Oliehoek, F.A.; Amato, C. A Concise Introduction to Decentralized POMDPs; Springer: Cham, Switzerland, 2016; Volume 1.
50. Curasma, H.P.; Pan, C.F.; Estrella, J.C. Agents for automatic control of sensors using Multi-Agent Systems and Ontologies: A scalable IoT architecture. Procedia Comput. Sci. 2024, 238, 404–411.
51. Chai, A.; Yin, W.; Lian, M.; Sun, Y.; Guo, C.; Wang, L.; Fang, Z. DUA-MQTT: A Distributed High-Availability Message Communication Model for the Industrial Internet of Things. Sensors 2025, 25, 5071.
52. Bonci, A.; Gaudeni, F.; Giannini, M.C.; Longhi, S. Robot operating system 2 (ros2)-based frameworks for increasing robot autonomy: A survey. Appl. Sci. 2023, 13, 12796.
53. Rafi, M.S.M.; Behjati, M.; Rafsanjani, A.S. Reliable and cost-efficient IoT connectivity for smart agriculture: A comparative study of LPWAN, 5G, and hybrid connectivity models. In International Conference on Smart Computing and Informatics; Springer: Cham, Switzerland, 2025; pp. 389–411.
54. Nawaz, M.; Babar, M.I.K. IoT and AI for smart agriculture in resource-constrained environments: Challenges, opportunities and solutions. Discov. Internet Things 2025, 5, 24.
55. Dembani, R.; Karvelas, I.; Akbar, N.A.; Rizou, S.; Tegolo, D.; Fountas, S. Agricultural data privacy and federated learning: A review of challenges and opportunities. Comput. Electron. Agric. 2025, 232, 110048.
56. Killeen, P.; Lin, C.; Li, F.; Kiringa, I.; Yeap, T. IoT-based smart farming architecture using federated learning: A nitrous oxide emission prediction use case. ACM J. Comput. Sustain. Soc. 2025, 3, 1–38.
57. Vangala, A.; Das, A.K.; Chamola, V.; Korotaev, V.; Rodrigues, J.J. Security in IoT-enabled smart agriculture: Architecture, security solutions and challenges. Clust. Comput. 2023, 26, 879–902.
58. Arvanitis, N.; Graziosi, F.; Athanasiou, G.; Terpou, A.; Arvaniti, O.; Zahariadis, T. Utilizing TabPFN Transformer with IoT Environmental Data for Early Prediction of Grapevine Diseases. AgriEngineering 2025, 7, 173.
59. Ghiani, L.; Serra, S.; Sassu, A.; Deidda, A.; Deidda, A.; Gambella, F. Downy Mildew and Powdery Mildew Symptoms, Mendeley Data, Version 1. 2024. Available online: https://data.mendeley.com/datasets/5c8h6sh495/1 (accessed on 5 April 2026).
60. Ghiani, L.; Serra, S.; Sassu, A.; Deidda, A.; Deidda, A.; Gambella, F. Automated detection of downy mildew and powdery mildew symptoms for vineyard disease management. Smart Agric. Technol. 2025, 11, 100877.
61. Jocher, G.; Qiu, J. Ultralytics YOLO, Version 8.4.36. 2026. Available online: https://github.com/ultralytics/ultralytics (accessed on 5 April 2026).
62. Hollmann, N.; Müller, S.; Purucker, L.; Krishnakumar, A.; Körfer, M.; Hoo, S.B.; Schirrmeister, R.T.; Hutter, F. Accurate predictions on small data with a tabular foundation model. Nature 2025, 637, 319–326.
63. Hollmann, N.; Müller, S.; Eggensperger, K.; Hutter, F. TabPFN: A transformer that solves small tabular classification problems in a second. In Proceedings of the International Conference on Learning Representations 2023, Kigali, Rwanda, 1–5 May 2023.
64. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.



