1. Introduction
Climate change has become one of the most pressing global challenges, particularly in agriculture, as it significantly impacts food production and security worldwide. Changes in temperature and rainfall patterns, pest pressure, and more frequent extreme weather events all reduce the long-term productivity and sustainability of agriculture. Climate change also accelerates the growth and expands the range of crop diseases and pests, which can further reduce yields and necessitate greater use of agrochemicals, contributing to environmental pollution [
1,
2].
In this context, Controlled Environment Agriculture (CEA) has emerged as a promising solution, offering efficient systems capable of producing fresh, local, and often organic food. It can be defined as a highly efficient agricultural method that grows plants in fully controlled indoor spaces, such as vertical farms, using less land, water, and nutrients than traditional agriculture [
3]. As one of the pioneers in this field, Despommier [
4] outlined several advantages of vertical farming, including year-round crop production, elimination of weather-related losses and agricultural runoff, significant water savings, and opportunities for urban revitalization and ecosystem restoration. Despite these benefits, CEA’s sustainability is often overlooked, and a more detailed understanding of its potential is crucial for promoting its application and cultivating consumer acceptance [
5]. While CEA has the potential to provide sustainable, organic, and tasty local food, public understanding and credibility must be deepened to overcome environmental concerns about unnatural aspects [
6]. Furthermore, CEA’s narrow focus and high capital and energy requirements constrain its contribution to the urban food system [
7].
Recent advancements in Remote Sensing (RS) and Machine Learning (ML) present opportunities to address these challenges. Improved satellite systems and sensor networks now allow researchers to collect and analyze vast amounts of agricultural data with high precision [
8,
9]. Machine learning algorithms, such as Random Forests, neural networks, and long short-term memory (LSTM) models, are already being applied across the agricultural field for tasks like crop and disease recognition, growth prediction, and yield optimization [
10,
11]. Together, they form the foundation of smart farming systems that can make CEA more precise, efficient, and sustainable.
Instead of relying on a reactive approach, farmers can adopt a predictive strategy enabled by the insights provided by RS and ML on environmental trends, thus supporting better planning and decision-making [
12,
13]. Object-based classification methods, spectral analysis, and airborne laser scanning are used to identify everything from habitat trees to plantation age, although challenges remain, particularly when it comes to extracting social and functional information from commercial remote sensing data [
14,
15,
16]. Moreover, ML-driven technologies provide valuable information for processing remote sensing data to enhance the quality and quantity of agricultural output [
17]. Hyperspectral, multispectral, and aerial images offer comprehensive insights into the health and dynamics of agricultural systems [
18].
Recent scholarly work emphasizes the role of sensor networks, big data, and the Internet of Things (IoT) in processing and analyzing agricultural data to support sustainable practices [
9]. Machine learning algorithms are also widely used in this field. However, not all challenges are technological. Research on land use and land cover change in the context of environmental issues has become an increasingly important topic [
12,
19]. For example, the growing concern over microplastics in soil highlights knowledge gaps regarding their long-term effects on crop productivity and food safety, an area where improved monitoring and data analysis could also play a key role [
20].
Researchers have focused on developing and optimizing machine learning models for processing, analyzing, and interpreting remote sensing data, particularly for land cover/use classification and biophysical parameter estimation [
21]. Furthermore, scholars are exploring the use of neural networks for automated crop and pest/disease recognition, long short-term memory (LSTM) networks for predicting crop growth, and various tools to support farmers in the decision-making process [
11,
22]. The literature also suggests that improved satellite systems have created a need for new models that can help analyze spectral time series [
8]. In this context, machine learning techniques, such as the Random Forest algorithm, have been widely applied in agricultural and land cover research [
10].
Although the intersection of ML and RS has been extensively studied in open-field agriculture, their integration within CEA remains underexplored. Only a handful of bibliometric analyses exist in this field. Dsouza et al. [
3] conducted a thematic analysis (n = 610) of the research landscape of controlled environment agriculture (CEA), identifying four research areas: technical, biological, environmental, and socio-economic. They found that CEA has several benefits over traditional agriculture, including a reduced environmental impact due to its proximity to urban centers, which minimizes the need for transportation. Additionally, CEA supports urban economies by creating skilled jobs and promoting sustainability. Similarly, Abdollahi et al. [
23] carried out a bibliometric analysis of the potential of wireless sensor networks (WSN) in agriculture. They analyzed 2444 documents from the Scopus database and found that WSN applications in precision agriculture can modernize data collection, contributing to the automation of farming systems.
In contrast, Bertoglio et al. [
24] explored the Digital Agricultural Revolution (DAR) through a bibliometric analysis of 4995 articles (2012–2019), highlighting its role in enhancing sustainable farming practices to address climate change and food security. However, they identified several barriers to the widespread adoption of ML in agriculture, including the high cost of technology, lack of technical expertise among farmers, data privacy concerns, and difficulty in deploying ML models in resource-constrained rural environments. The authors argue that addressing these socio-economic and infrastructural challenges is as important as making technical advancements.
Recent research has highlighted the growing role of machine learning (ML) and deep learning (DL) in advancing precision agriculture. Rejeb et al. [
25] discuss the evolution from traditional ML algorithms (e.g., Random Forest, Support Vector Machines) to advanced deep learning methods (e.g., Convolutional Neural Networks, Recurrent Neural Networks), which have significantly improved accuracy in image-based agricultural applications. While traditional algorithms remain popular due to their interpretability and lower computational cost, deep learning (DL) models have demonstrated superior performance in image-based tasks, such as disease identification and weed classification. Ojo et al. [
22] note in their review that the majority of DL applications (82%) focus on greenhouses, with the main tasks being yield estimation (31%) and growth monitoring (21%). Other significant applications include disease detection, microclimate prediction, nutrient estimation, small insect detection, and robotic harvesting. Indoor farms and vertical farming systems also utilize DL for stress-level monitoring, growth monitoring, and yield estimation, although to a lesser extent. However, as Kabir et al. [
26] pointed out, the widespread adoption of these technologies faces barriers such as high costs and the need for specialized expertise, particularly in resource-limited regions.
Together, these findings describe a new approach to agricultural research and practice that combines data, technology, and environmental awareness to support more sustainable systems [
27]. However, the available literature is often too general or narrowly focused on specific technologies. To address this gap, the present study aims to explore the integration of Remote Sensing (RS) and Machine Learning (ML) into Controlled Environment Agriculture (CEA) through a structured, comprehensive approach. Through bibliometric and content analyses, this paper highlights the progress and key contributions of the field and identifies emerging research directions. Based on this objective, the following research questions were formulated:
RQ1: What are the key publication trends, influential articles, and the most productive authors and countries in the field of Machine Learning and Remote Sensing applied to Controlled Environment Agriculture (CEA)?
RQ2: What are the research directions for the use of Machine Learning and Remote Sensing in sustainable agriculture?
RQ3: How are Machine Learning and Remote Sensing technologies applied within CEA systems to enhance sustainability, resource management, and operational efficiency?
3. Results
To address the three interrelated research questions, we first present the results of the bibliometric analysis, focusing on publication trends, influential articles, and key contributors (RQ1). Subsequently, the identification of research directions, derived from keyword co-occurrence mapping, is presented (RQ2) and integrated with qualitative content analysis to explore how Machine Learning and Remote Sensing are applied in CEA systems to support sustainability and operational efficiency (RQ3).
As shown in Figure 2, the first article on this topic was published in 2001, and academic activity in this area remained minimal until 2019. From 2019 onward, the number of articles rose steadily as the underlying technologies advanced, reaching 13 papers in 2024. The upward trend is evident: although 2025 has only just begun, seven articles have already been published. Considering the importance of the topic, we estimate that the number of publications will exceed 20 by the end of the year. Notably, despite this clear growth, the overall number of publications in this field remains relatively low. This suggests that while interest and activity are increasing, the field is still at a developing stage, with plenty of room for further exploration and contribution.
As shown in Figure 3, the distribution of scientific production by country reveals a clear imbalance, with a few nations contributing a substantial share of publications. India leads with 12 publications, closely followed by the United States with 11 and the People’s Republic of China with 9. These three countries dominate the dataset, a concentration that likely reflects both technological infrastructure and institutional investment in agritech research. The distribution thus underscores the concentration of research output in a few highly active countries and the more limited participation of a wider range of countries.
In examining the most influential documents within our dataset (Table 1), it is evident that the most cited articles reflect a broader, interdisciplinary approach. Some highly cited works emphasize the importance of soil health [
32], water management for irrigation [
33], and sustainable land management practices, especially in addressing the growing global issue of soil salinization, which poses a significant threat to long-term agricultural productivity [
34]. At the same time, several articles are directly aligned with the core theme, particularly those related to machine learning, object detection (YOLOWeeds), autonomous navigation using deep learning, and sensor/drone technologies for sustainable crop management. Their citation counts range from 47 to 120, with the three most cited clustered between 113 and 120 [
35,
36,
37,
38,
39]. The similar citation levels of the most cited of these studies point to active research interest in using advanced technologies, such as object detection algorithms and remote sensing, for sustainable weed control and land consolidation policies. These contributions show how machine learning is becoming a key enabler of automation, precision, and efficiency in controlled environments.
The article “Renewable energy-powered semi-closed greenhouse for sustainable crop production using model predictive control and machine learning for energy management” [
40] represents a direct integration of machine learning within a CEA framework, showcasing sustainable innovations that align closely with the core theme of this research. Notably, one of the high-impact articles highlights the importance of developing sustainable grazing systems to enhance both land quality and the overall health and welfare of animals [
41]. Overall, the development of this field is largely based on established work in environmental management, precision agriculture, and artificial intelligence-based solutions, laying the foundation for future innovations in sustainable CEA systems.
Table 1.
Top 10 most influential papers.
Authors | Article Title | Citations | Year |
---|---|---|---|
Barrios, E [32] | Soil biota, ecosystem services, and land productivity | 611 | 2007 |
Singh, A [34] | Soil salinization management for sustainable development: A review | 275 | 2021 |
Dang, FY et al. [35] | YOLOWeeds: A novel benchmark of YOLO object detectors for multi-class weed detection in cotton production systems | 120 | 2023 |
Esposito, M et al. [36] | Drone and sensor technology for sustainable weed management: A review | 119 | 2021 |
Yurui, L. et al. [37] | Impacts of land consolidation on the rural human-environment system in typical watershed of the Loess Plateau and implications for rural development policy | 113 | 2019 |
Villalba, JJ and Provenza, FD [41] | Self-medication and homeostatic behaviour in herbivores: Learning about the benefits of nature’s pharmacy | 91 | 2007 |
Hu, G and You, F [38] | Land-Use/Land-Cover Changes and Its Contribution to Urban Heat Island: A Case Study of Islamabad, Pakistan | 75 | 2020 |
Schultz, B and De Wrachien, D [33] | Irrigation and drainage systems research and development in the 21st century | 48 | 2002 |
Hu, G and You, F [40] | Renewable energy-powered semi-closed greenhouse for sustainable crop production using model predictive control and machine learning for energy management | 47 | 2022 |
Aghi, D et al. [39] | Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy | 47 | 2020 |
In addition, our analysis shows that five authors, Chang EJ, Chauhan BS, Deng KZ, Kava C, and Sharma A, stand out as the top contributors to the research field. Each contributed two records, or approximately 2.86% of the dataset. The highest number of publications appeared in the journal “Sustainability” (seven articles), followed by “Computers and Electronics in Agriculture”, “IEEE Access”, and “Sensors”, each with three articles.
To identify the most important research areas and themes in the field of innovative approaches to CEA, we carried out a keyword co-occurrence analysis. This allows us to track the main interests of researchers, based on terms found in titles, abstracts, and author-supplied keywords. Out of the 609 keywords, 79 met the threshold of appearing at least twice. A network visualization of keyword co-occurrences is presented in Figure 4, showing 464 links and four clusters, with a Total Link Strength (TLS) of 553. In the following section, we examine these clusters individually, describing each research direction along with the related findings from the 12 fully reviewed articles.
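The quantities reported for the keyword network (occurrence threshold, links, and total link strength) can be illustrated with a minimal sketch. It assumes full counting, where a link's strength is the number of documents in which two keywords appear together; the counting scheme actually used by the mapping software is not stated here, and the function name `cooccurrence_network` and the three toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(records, min_occurrences=2):
    """Build a keyword co-occurrence network from per-document keyword lists.

    Assumption: full counting, one count per document in which a pair co-occurs.
    """
    # Normalise case and count each keyword at most once per document.
    doc_sets = [set(k.lower() for k in kws) for kws in records]
    occurrences = Counter(k for kws in doc_sets for k in kws)

    # Keep only keywords that reach the occurrence threshold.
    kept = {k for k, n in occurrences.items() if n >= min_occurrences}

    # A link joins two kept keywords that share at least one document;
    # its strength is the number of documents containing both.
    link_strength = Counter()
    for kws in doc_sets:
        for a, b in combinations(sorted(kws & kept), 2):
            link_strength[(a, b)] += 1

    total_link_strength = sum(link_strength.values())
    return kept, link_strength, total_link_strength


# Hypothetical toy records, not taken from the reviewed dataset.
records = [
    ["machine learning", "remote sensing", "greenhouse"],
    ["machine learning", "remote sensing", "UAV"],
    ["machine learning", "hydroponics", "greenhouse"],
]
kept, links, tls = cooccurrence_network(records)
print(len(kept), len(links), tls)
```

In this toy case three keywords pass the threshold, and the per-cluster figures reported below follow the same logic restricted to the items of each cluster.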
3.1. Cluster 1 (Red): Climate-Responsive Agriculture and Soil Management via Remote Sensing
The first cluster (red) had 29 items and 43 links, with a total link strength (TLS) of 56. This cluster contains research centered on the challenges and innovations in agriculture under adverse environmental conditions, particularly climate change and soil degradation. The integration of remote sensing and data-driven approaches with traditional agricultural knowledge stands out.
Many keywords in the first cluster point toward soil salinization, organic matter, and conservation agriculture, which suggests that researchers are focused on maintaining or improving soil health in changing environments. The concern is both productivity and long-term sustainability: the cluster shows how researchers try to keep soils viable as climate and land use change.
The appearance of “arbuscular mycorrhizal fungi” and “biological control” adds an interesting nuance, as it shows that attention is paid to biological and microbial solutions in addition to technological ones. Therefore, while remote sensing and machine learning methods like “random forest” are being used to monitor or predict outcomes, there is also a recognition that working with natural systems (like fungal networks or beneficial soil organisms) is part of the solution. For example, one study employed a random forest (RF) machine learning model as the base classifier for the RUSBoost ensemble technique to predict the spatial distribution of land suitability classes for sugar beet, a key crop in specific regions [
14]. Others employed the Random Forests algorithm, a widely used machine learning classifier, to map 12 land cover classes, including different peatland types, based on multi-sensor remote sensing data [
42].
There is a strong emphasis on climate here, too, but what is interesting is how that plays into food security and efficiency. This is not just about managing crops; it is about feeding populations, especially as urbanization continues and usable agricultural land becomes more limited. It makes sense that crops like “rice” show up here, tied to food systems in climate-vulnerable regions like “China”, which is also mentioned as a keyword, pointing to the geographic location of case studies.
Our qualitative analysis shows that machine learning models such as MaxEnt have been applied to agricultural problems in Thailand, particularly for modeling crop suitability. MaxEnt, a presence-only modeling approach, has been shown to be effective in identifying suitable areas for both upland and lowland crops, including rice varieties [
43]. However, the model’s performance was sensitive to the geographic and environmental distribution of observed crop data. This highlights the importance of data quality and distribution in the training datasets, which directly impact model reliability.
Researchers have highlighted the rapid worldwide increase in the use of plastic greenhouses (PGs), especially in Asia, and underscored their negative effects on the local environment. These effects include major changes in the soil’s physical and chemical makeup, disruption of the water cycle, and a decrease in biodiversity [
44]. Medium-resolution Landsat Operational Land Imager (OLI) remote-sensing data are the primary data source for PG identification. Others in China utilize images captured from remote sensing equipment (like drones or fixed cameras) to continuously monitor the greenhouse environment. These images provide detailed data on various environmental parameters, such as temperature, humidity, light intensity, and plant health [
14].
The presence of “indexes”, “images” and “satellites” suggests that remote sensing tools are central to this research domain. They are being used not just for mapping but also for making decisions: identifying areas of risk, monitoring soil quality over time, and assessing the impacts of different farming systems. The use of remote sensing and cloud-based processing techniques allows for near-real-time monitoring of forest cover changes, enhancing transparency and providing timely information to local authorities and stakeholders. In one study, Landsat 7 satellite imagery and the Google Earth Engine (GEE) platform were used to process and analyze the remote sensing data. The authors demonstrated the effectiveness of this approach in quantifying forest cover changes at the regional scale, overcoming the challenges of traditional desktop-based GIS software [
45].
In essence, this cluster is where agriculture intersects with remote sensing. It is framed by the need to adapt to climate pressure and preserve food systems, and it brings together technology, ecology, and sustainability. Various systems have been developed that integrate autonomous Unmanned Aerial Vehicles (UAVs) with machine learning (ML) frameworks to support sustainable agriculture and environmental monitoring, aligning with broader environmental conservation goals [
46]. UAVs can support remote sensing applications such as tree health assessment, tree population estimation, and water resource identification, all of which matter for sustainable environmental monitoring. One such system is organized around two learning components, described as “brains”: the first employs deep reinforcement learning (DRL) principles to enable UAV navigation in complex agricultural terrain, while the second uses the Faster R-CNN algorithm and is reported to achieve 98% accuracy in tasks like tree counting, water detection, and plant health analysis [
46].
3.2. Cluster 2 (Green): AI-Powered Vision Systems for Precision Agriculture
The second cluster (green) contained 22 items, 28 links, and a total link strength of 36. This cluster focuses on integrating artificial intelligence and computer vision tools into agriculture, emphasizing how these technologies are used to monitor crops and support precision management. The presence of items like “computer vision”, “deep learning”, “neural networks”, “image processing”, “classification”, “YOLOv8”, “machine vision”, and “networks” indicates the use of techniques rooted in machine learning and AI-based visual analysis. The mention of “machine learning in agriculture” and “prediction” indicates that this cluster is about more than just image recognition. Rather, it is about building intelligent systems that can analyze and interpret agricultural data in real time to support decision-making. These systems are being trained to detect and classify different crop types, identify weeds, and even assess conditions like “soil erosion”, all based on visual input from drones or ground-based sensors.
Deep learning techniques, including convolutional neural networks (CNNs), region-based CNNs, and YOLO models, have been used for various agricultural applications, such as fruit and flower detection, crop yield estimation, and plant disease identification [
37]. This cluster highlights several emerging trends in machine vision and deep learning applications for agriculture, such as the use of hybrid techniques, integration of multiple sensor modalities, and the development of real-time automated systems for tasks like robot harvesting [
47].
One of the more interesting aspects of this cluster is how it links advanced AI methods, like those found in “YOLOv8” and other object detection architectures, with very practical, field-level applications such as “weed detection”, “crop identification” and “site-specific management”. These are not abstract models; they are being applied to solve real problems on farms, especially in the context of “precision agriculture”. “Indoor farming” also appears, which suggests that some of these technologies are being tested or deployed in controlled environments where lighting, plant spacing, and background conditions can be optimized for vision-based systems. This may also facilitate the generation of large datasets for training neural networks.
This cluster addresses the challenge of accurately identifying the Chilo suppressalis pest, a major threat to rice crops in the temperate and subtropical regions of Asia. The proposed approach, which integrates smartphone-captured images with a deep learning-based classification model, represents a significant methodological innovation in the field of precision agriculture and intelligent pest management. Researchers have focused on developing a user-friendly, smartphone-based system that can provide rapid and accurate pest identification for farmers, which is a relevant and impactful contribution [
48]. The choice of a deep learning-based approach, specifically a 400-unit artificial neural network, is grounded in the existing literature on machine learning applications in agriculture.
Most of these papers aim to improve the accuracy and efficiency of agricultural monitoring, either through the early detection of crop stress or more targeted use of inputs. Therefore, terms like “management”, “prediction”, “site-specific management”, and “sustainable agriculture” appear together. This cluster links technical solutions with overall agricultural goals, especially regarding sustainability and resource optimization.
It is also worth mentioning that the keyword “UAV” suggests a strong role for drone-based imaging in this space. UAVs provide a flexible and scalable platform for collecting high-resolution visual data, which is essential for training and deploying many of the AI models mentioned in this cluster. The proposed system architecture and POMDP-based motion planner aim to improve the reliability and robustness of UAV operations in the face of environmental uncertainty and imprecise object detection, contributing to the more sustainable and efficient use of these technologies [
49].
Other studies have utilized a multi-sensor approach combining optical (Sentinel-2), C-band SAR (Sentinel-1), and L-band SAR (ALOS PALSAR) data, along with topographic information, to accurately map peatlands and distinguish between undisturbed and disturbed (pasture) peatland classes [
42]. Overall, this cluster highlights a growing area of research at the intersection of agriculture, AI, and imaging technologies. The focus is on practical applications, but it is grounded in cutting-edge developments in “machine learning”, “computer vision” and neural network architecture. These tools are used to make agriculture smarter, more precise, and better adapted to future challenges.
3.3. Cluster 3 (Blue): Digital Technologies and System Optimization in Sustainable Farming
Cluster 3 (blue) has 15 items and 32 links, with a total link strength (TLS) of 42. This cluster contains research centered on the digital infrastructure of modern agriculture and how farms are being equipped with connected technologies to become smarter, more efficient, and more sustainable. The most prominent terms, such as “internet of things”, “wireless sensor networks”, “big data” and “GIS”, show a clear focus on connectivity, real-time monitoring, and data integration at the field level. The integration of remote sensing data with GIS and geostatistical techniques has demonstrated the effectiveness of these technologies in characterizing the spatial variability of soil properties and land suitability [
50]. The collaboration between Ain Shams University (Cairo, Egypt), the National Authority for Remote Sensing and Space Sciences (Cairo, Egypt), and the Institute of Olive Tree, Subtropical Crops, and Viticulture (Crete, Greece) suggests a strong connection between academic research and practical applications in the field of sustainable agriculture.
Additionally, “smart agriculture” and “smart farming” are prevalent. Papers in this cluster typically deal with infrastructure and system designs that allow farms to function more intelligently. Sensors placed in soil or on plants, connected via wireless networks, feed data back into systems that optimize decisions regarding “irrigation”, “fertilization”, and environmental control.
Smart farming can provide farmers with data on topography, climate, soil parameters, and other factors to optimize the use of fertilizers, pesticides, and other inputs, thereby improving sustainability [
47]. Precision agriculture and automation in the agricultural industry have the potential to moderate resource usage and increase food quality in the post-pandemic world.
“Machine learning” is also a key part of this cluster, particularly when it comes to analyzing sensor data or performing tasks like “object-based classification” of spatial imagery. The mention of “accuracy” and “optimization” suggests that much of the research here is not just about building these systems but refining them, ensuring that automation is precise, reliable, and scalable. In a comparison of three machine learning algorithms (CART, SVM, and RDF) for extracting plastic greenhouse (PG) distribution from remote sensing data, the SVM classifier achieved the highest accuracy in identifying PGs, outperforming the other two methods in terms of recall, precision, and F-measure [
34].
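The three metrics used in that comparison can be computed directly from predicted and reference labels. The sketch below is illustrative only: it assumes a binary PG / non-PG labeling and the balanced F-measure (F1), which the text does not specify, and the label vectors are made up rather than drawn from the cited study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical pixel labels: 1 = plastic greenhouse, 0 = other land cover.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Reporting all three matters because a classifier can reach high precision by labeling very few pixels as PG while missing most greenhouses, which recall would expose.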
Another interesting component is the appearance of “hydroponics”. In these systems, digital tools can monitor every aspect of plant growth, including light, nutrients, and temperature, and optimize them using feedback mechanisms. This connects to the overall goal of “sustainable development”, which is also a core keyword in the cluster. The presence of “design” and “state-of-the-art” implies that many of the papers here are not just applied studies but also proposals for new architectures or system frameworks. Essentially, this cluster lays out how digital agriculture should be built from a technical standpoint.
What separates this cluster from the previous one (which focused more on AI and vision systems) is that the emphasis here is on the system as a whole. It involves infrastructure, data flow, optimization algorithms, and the integration of hardware and software components. It is less about individual tasks (like detecting weeds) and more about making the entire farming system run more efficiently and sustainably through technology. Overall, this cluster captures the current push toward digitally connected, data-driven agriculture, emphasizing IoT-based systems, sensor networks, and intelligent infrastructure designed for sustainability and resource optimization.
3.4. Cluster 4 (Yellow): Intelligent Systems for Monitoring and Enhancing CEA Performance
Cluster 4 (yellow) has 13 items and 11 links, with a total link strength (TLS) of 11. This cluster revolves around the use of artificial intelligence and data modeling to improve the growth of crops in controlled environments such as greenhouses, vertical farms, and hydroponic systems, where environmental variables are more manageable. The central theme here is the optimization of resources, plant growth, and efficiency. In this cluster, the efficiency of high-resolution Sentinel-2 satellite imagery and machine learning models is highlighted in enabling real-time, spatially explicit monitoring of pasture biomass parameters within small-scale controlled environments [
51]. Articles from our qualitative analysis showcase how these remote sensing technologies can support precision agriculture and data-driven decision-making in CEA by providing timely and accurate information to optimize production processes. In contrast, proximal sensing, which involves sensors placed in close contact with plants or soils (e.g., chlorophyll meters, hyperspectral cameras, or root zone sensors), provides high-resolution, real-time data under artificial lighting and controlled conditions, making it critical for precision monitoring in greenhouses or vertical farms. Although the term does not explicitly appear in our dataset, the technologies described in Cluster 4 conceptually align with proximal sensing methods.
Terms like “controlled environment agriculture”, “artificial intelligence”, “sensors” and “model” refer to research that relies on data collection and algorithmic control. These systems use real-time feedback from environmental sensors to regulate important variables such as “temperature”, “climate”, “water” and “nitrogen uptake.” The goal is clear: to fine-tune the growing conditions to maximize yield and reduce waste.
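As a rough illustration of sensor-driven feedback regulation, the sketch below simulates a proportional controller holding greenhouse air temperature near a setpoint. Everything in it is an assumption made for illustration: the control law, the toy thermal model, and all parameter values (`gain`, `heat_rate`, `loss`, `ambient`) are hypothetical and do not describe any system from the reviewed articles.

```python
def proportional_step(setpoint, measured, gain, max_power=1.0):
    """Return heater power in [0, max_power] from the current temperature error."""
    error = setpoint - measured
    return min(max(gain * error, 0.0), max_power)


def simulate(setpoint=24.0, start=18.0, gain=0.5, steps=30,
             heat_rate=2.0, loss=0.1, ambient=18.0):
    """Simulate a toy greenhouse: heating adds warmth, losses pull toward ambient."""
    temp = start
    history = []
    for _ in range(steps):
        # The "sensor reading" is the simulated temperature itself.
        power = proportional_step(setpoint, temp, gain)
        temp += heat_rate * power - loss * (temp - ambient)
        history.append(temp)
    return history


history = simulate()
print(round(history[-1], 2))
```

With these made-up parameters the temperature settles slightly below the setpoint, the familiar steady-state offset of purely proportional control, which is one reason practical systems often rely on richer schemes such as the model predictive control mentioned earlier.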
Notably, researchers seem to be interested in artificial intelligence-driven techniques that draw on auxiliary data sources to improve the accuracy and reduce the cost of geochemical mapping. A wide range of high-resolution auxiliary data sources has been mentioned, including airborne geophysical surveys, aerial elevation data, land gravity surveys, and satellite imagery [
52]. The authors discussed the use of quantile regression forests, an advanced machine learning technique, to model and predict soil geochemical properties based on these auxiliary datasets.
The presence of “principal component analysis” suggests a strong data analytics component. Papers in this cluster use statistical modeling and dimensionality reduction to better understand the complex relationships between growing conditions and plant responses. These insights are then fed into predictive or control models powered by AI.
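As a minimal sketch of the dimensionality reduction mentioned above, the following computes a principal component analysis for just two variables, using the closed-form eigenvalues of the 2x2 sample covariance matrix. The function name and the readings are hypothetical; the reviewed papers work with more variables and would typically rely on a numerical library.

```python
import math


def pca_2d(xs, ys):
    """Return (first principal direction, explained variance ratio) for two variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = mid + spread, mid - spread
    # Eigenvector belonging to the larger eigenvalue lam1.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), lam1 / (lam1 + lam2)


# Hypothetical readings: air temperature (degrees C) and a growth index.
temperature = [18.0, 20.0, 22.0, 24.0, 26.0]
growth = [1.1, 1.4, 1.9, 2.1, 2.6]
direction, explained = pca_2d(temperature, growth)
```

A high explained variance ratio for the first component indicates that the two variables move largely together, which is the kind of relationship between growing conditions and plant response that such analyses aim to summarize.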
An innovative approach has emerged in which researchers have developed a soft robotic gripper with integrated proprioceptive sensing that can be applied in a wide range of robotic applications, including manufacturing, logistics, and assistive technologies. The authors demonstrated the effectiveness of the soft gripper design and its ability to adapt to various object shapes and sizes, highlighting its potential for improving the performance and versatility of robotic systems [
53]. This can contribute to the development of more sustainable and energy-efficient robotic systems for various applications.
During our qualitative analysis, we identified another proposed smartphone-based system that aligns with the principles of precision agriculture. This system would enable efficient and targeted pest management practices to reduce crop damage and minimize the environmental impact of pesticide use [
48].
The term “supply chain” points to a broader view in some studies, linking production inside controlled environments to logistics and distribution. This suggests that some studies are not only looking at plant science or engineering but also considering how CEA fits into sustainable food systems. Overall, this cluster is about intelligent systems that make growing food in controlled environments more efficient and responsive. It is where AI, sensor technology, and plant physiology come together, pushing forward the kind of agriculture that is less dependent on external climate and more integrated with digital infrastructure.
3.5. Research Trends
The overlay visualization of the keyword co-occurrence analysis, presented in
Figure 5, highlights a noticeable shift in the research focus over recent years toward the integration of artificial intelligence, specifically machine learning and deep learning, into agriculture. More recent keywords (represented in yellow tones), such as YOLOv8, weed detection, and machine learning in agriculture, indicate increased attention to computer vision and real-time data applications for tasks like crop monitoring and automated decision-making. Core themes such as precision agriculture, remote sensing, and climate change remain central, connecting both technological and environmental concerns.
In contrast, earlier research (2018–2020, shown in blue tones) was more concentrated on traditional agronomic topics like organic matter, arbuscular mycorrhizal fungi, nitrogen uptake, and irrigation. In recent years, there has been a growing emphasis on indoor farming, hydroponics, and wireless sensor networks, pointing to a broader range of agricultural settings and the emerging role of Internet of Things (IoT) solutions. The presence of terms such as optimization, big data, and networks further underscores the data-driven nature of this field.
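The temporal coloring described above can be approximated with a short sketch, assuming the overlay scores each keyword by the average publication year of the documents in which it appears (a common choice for this type of visualization). The records and the function name are hypothetical and do not reproduce the reviewed dataset.

```python
from collections import defaultdict


def average_publication_year(records):
    """Map each keyword to the mean publication year of the documents that use it."""
    years = defaultdict(list)
    for year, keywords in records:
        # Count each keyword once per document.
        for keyword in set(keywords):
            years[keyword].append(year)
    return {kw: sum(vals) / len(vals) for kw, vals in years.items()}


# Hypothetical (year, keywords) records, for illustration only.
records = [
    (2018, ["irrigation", "nitrogen uptake"]),
    (2020, ["irrigation", "remote sensing"]),
    (2024, ["weed detection", "remote sensing"]),
]
scores = average_publication_year(records)
ranked = sorted(scores, key=scores.get)  # oldest to most recent
```

Keywords at the start of `ranked` would fall toward the blue end of the color scale and those at the end toward the yellow end.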
4. Discussion
This study provides a comprehensive overview of the literature on the integration of Machine Learning (ML) and Remote Sensing (RS) technologies into Controlled Environment Agriculture (CEA), with a focus on their contributions to sustainability, resource optimization, and system efficiency. Through bibliometric and content analyses, we answered our research questions by identifying the key contributors, emerging trends, and practical applications of RS and ML in the context of sustainable CEA systems.
Our bibliometric analysis revealed a clear upward trend in publications related to ML and RS in CEA systems. Meaningful academic engagement on this topic began after 2019, when technological advances, particularly in AI, sensor networks, and automation, enabled the development of more sophisticated applications. Despite this positive trend, the overall volume of research remains modest, indicating that this field is still in its developmental phase. This early stage presents substantial opportunities for future exploration, especially in areas where ML and RS intersect with climate resilience and sustainable farming technologies.
Our findings indicate a clear evolution from traditional agricultural research toward technology-driven approaches, particularly those grounded in artificial intelligence (AI), machine learning (ML), and data-centric strategies. The bibliometric analysis identified four dominant research clusters that align closely with the transition to smart agriculture and climate-resilient practices, thereby addressing RQ1 and RQ2 regarding major trends and influential works in the field.
Each cluster has distinct advantages and challenges. Cluster 1 integrates remote sensing and ecological approaches to enhance climate adaptation and soil management, although it may face data quality sensitivities and environmental concerns related to certain technologies like plastic greenhouses. Cluster 2 focuses on AI-powered vision for precise crop and pest monitoring, showing high accuracy with deep learning models but requiring large datasets and advanced technical capacity. Cluster 3 centers on IoT and big data for smart farming, improving resource use through sensor networks, yet faces integration complexity and adoption barriers due to the cost and skills required. Cluster 4 focuses on intelligent systems in controlled environments, offering real-time optimization and innovative automation, though substantial investment and model generalizability remain challenges. Evaluations across clusters highlight that while each approach advances sustainable agriculture, its performance and scalability depend strongly on data quality, technical capacity, and contextual factors.
A significant number of recent publications emphasize precision agriculture, computer vision, and real-time data monitoring, underlining the growing importance of AI technologies in achieving more efficient and adaptive agricultural systems. Tools like YOLOv8, which are used for object detection and real-time image analysis, are at the forefront of efforts to automate processes such as weed and pest identification, directly supporting decision-making in both open fields and indoor farming contexts.
Aligned with RQ3, our content analysis reveals that ML and RS are increasingly applied to enhance resource efficiency and environmental sustainability in CEA settings. In greenhouses, vertical farms, and hydroponic systems, sensor networks and feedback loops enable real-time adjustments to temperature, humidity, and nutrient levels. These innovations facilitate higher productivity while minimizing the use of water, fertilizer, and pesticides. Moreover, the shift toward IoT-based systems and big data analytics reflects a broader trend toward interconnected farming environments. These technologies offer opportunities to optimize irrigation and fertilization while enabling predictive models for yield estimation, plant stress detection, and crop growth forecasting. The presence of terms such as optimization, networks, and automation in the literature indicates a strong emphasis on sustainability through data-driven control.
Our review also highlights the diversification of agricultural environments, moving beyond traditional soil-based farming to include indoor farming, aquaponics, and modular vertical agriculture. These systems are particularly promising in urban areas or regions facing climate stress where conventional agriculture is not feasible. However, their implementation still faces practical constraints, including high energy demands and economic barriers, especially in resource-limited regions.
Despite substantial technological progress, the literature pays limited attention to socio-economic dimensions, including farmer readiness, labor market transitions, and policy support. This gap hinders the widespread adoption of CEA and should be prioritized in future research. A more holistic research agenda that incorporates economic viability, training programs, and community engagement is essential for scaling these innovations in a sustainable and inclusive manner.
From a theoretical perspective, this study provides a structured overview of RS and ML applications in CEA. From a practical standpoint, it offers actionable insights for policymakers and stakeholders seeking to scale smart agricultural solutions. Recognizing the need to balance innovation with accessibility, especially in emerging economies, we recommend that future research adopt a more holistic lens that incorporates economic feasibility, training, and community-level engagement. There is also a growing need to bridge the gap between theoretical development and field deployment, particularly in low-resource settings where CEA could offer significant benefits for food security. Moreover, we consider it important to investigate the long-term impacts on biodiversity, labor markets, and urban infrastructure.
While our study provides a timely synthesis of the literature, it has some limitations. The dataset is limited to 70 documents from the Web of Science (WoS), which, while rigorous, may not fully capture the breadth of research on this topic. In addition, the topic-based search strategy, while ensuring relevance and focus, may have excluded relevant studies that do not explicitly mention “controlled environment agriculture” in their indexed fields. Therefore, some relevant technologies and terms related to RS and ML may not have been included as prominent keywords in our bibliometric analysis. Furthermore, this analysis reflects the state of research as of 15 April 2025 and may not capture the latest developments in rapidly evolving fields such as deep learning or spectral analysis in agriculture. Nevertheless, this study achieved its stated objectives by providing a structured synthesis of existing research and identifying key trends, contributors, and technological directions in the use of Machine Learning and Remote Sensing for sustainable Controlled Environment Agriculture.