Applications of Machine Learning Technology in Agricultural Data Mining

Vlaicu, Petru Alexandru; Matei, Basarab

doi:10.3390/app15105286

Open Access Editorial

by Petru Alexandru Vlaicu ¹,* and Basarab Matei ²,*

¹ Feed and Food Quality Department, National Research and Development Institute for Animal Biology and Nutrition, 077015 Ilfov, Romania
² Laboratoire d’Informatique de Paris Nord, Institut Galilée, Université Sorbonne Paris Nord (USPN), 99 Avenue Jean-Baptiste Clément, 93430 Paris, France
* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(10), 5286; https://doi.org/10.3390/app15105286

Submission received: 24 April 2025 / Accepted: 7 May 2025 / Published: 9 May 2025

(This article belongs to the Special Issue Applications of Machine Learning Technology in Agricultural Data Mining)

1. Introduction

The global agricultural sector is undergoing a revolutionary transformation with the growing integration of machine learning (ML) technologies into traditional farming and agronomic practices. The demands for improved productivity, climate-resilient crops, optimum resource utilization, and sustainable agriculture have pushed the adoption of sophisticated computational models, with ML emerging as the cornerstone of this shift.

ML, a subset of artificial intelligence (AI), has emerged as a transformative tool in numerous industries and fields, including agriculture [1]. The ability of ML algorithms to process and analyze vast amounts of data, identify patterns, and make predictions or decisions based on those patterns has made them an invaluable resource in this domain. These methods enable pattern recognition, predictive modeling, and machine-based decision-making, significantly enhancing the precision and efficiency of farming activities. The combination of ML with agricultural data mining offers new avenues for optimizing yield prediction, crop health monitoring, soil analysis, pest detection, and climate impact assessment [2].

The benefits of using ML in agriculture are extensive. First, ML models support predictive analytics for forecasting yields and assessing risk under fluctuating environmental conditions [3]. For example, convolutional neural networks (CNNs) have been used successfully to detect crop diseases from leaf images, with more than 95% accuracy in some studies [4]. Second, resource optimization can be achieved through ML algorithms that guide irrigation scheduling, nutrient application, and pesticide treatment, thereby reducing input costs and minimizing environmental impacts [5].

ML technologies show transformative potential for both smallholder and industrial-scale commercial agriculture. ML-enabled mobile applications have benefited Indian farmers by identifying pests in their crops and providing agronomic advice tailored toward increased productivity and resilience [6]. In developed regions, ML-enabled systems combined with remote sensing monitor field conditions in real time, supporting targeted, micro-level interventions with minimal waste [7].

Moreover, unsupervised learning and clustering procedures have played an essential role in agronomic zoning, identifying homogeneous farming regions based on climatic and soil conditions in order to improve policymaking and extension services [8]. In livestock production systems, ML has played an essential role in monitoring animal welfare, behavior, and productivity through sensor-based technologies. In poultry production systems, for example, dietary formulation is one of the most significant factors influencing both the economic viability of operations and the health and productivity of the birds. Traditionally, diet optimization has relied on empirical methods and linear programming techniques, which, while effective, often fall short in accommodating the complex, nonlinear interactions between feed ingredients, bird physiology, and environmental factors. ML algorithms such as artificial neural networks (ANNs), support vector machines (SVMs), and ensemble methods have demonstrated superior capabilities in modeling these interactions [9]. By leveraging large datasets, these algorithms can predict the optimal nutrient composition for different growth stages, genotypes, and production purposes. ML can also aid in identifying alternative feed ingredients, such as insect protein or agricultural by-products, which are currently of great interest for reducing dependency on conventional resources like soybean meal, which is both economically and environmentally expensive [10].

Furthermore, the increasing availability of agricultural data, from satellite images and drone data to Internet of Things (IoT) sensor networks and genomics data, has itself fueled further ML research and implementations. However, challenges remain in the form of data heterogeneity, a lack of sufficient labeled datasets, and the need for interpretable models that support human decision-making [11].

The quality of animal-origin products is another area where ML has made substantial contributions [12]. Consumer preferences and market standards necessitate precise quality control measures, which traditionally involve labor-intensive and subjective assessment methods. ML offers automated, non-invasive, and highly accurate alternatives, such as computer vision systems powered by CNNs that can analyze images of eggs or carcasses to detect defects, classify grades, or even predict shelf life [13]. Similarly, Alvarez-García et al. [14] showed that spectroscopic data combined with ML algorithms can be used to assess meat quality attributes such as tenderness, water-holding capacity, and chemical composition. These advancements not only improve the efficiency and accuracy of quality control processes but also enable the industry to meet the growing demand for traceability and transparency of agricultural products through data mining.

2. Overview of Published Articles

This Special Issue, entitled ‘Applications of Machine Learning Technology in Agricultural Data Mining’, has published six research papers and one perspective, which together demonstrate the real-world value of ML and AI technologies in agriculture. These papers address various aspects of ML usage in agricultural data.

Tanchev et al. [15] examined attributions in CNNs, aiming to interpret and validate the model’s decision-making process in the context of cow identification, a task that requires distinguishing subtle visual features among animals with similar appearances. Their dataset comprised 760 images of 168 Holstein dairy cows, taken on several farms in Southern Bulgaria. The goal was a model for the visual face recognition of cows that could be used in future applications to manage the health, welfare, and productivity of the animals at the herd and individual levels in real time. By analyzing the attributions, the authors reveal that, when examples are classified successfully, CNNs focus on pertinent cow-specific features such as facial structure and coat patterns. The study also shows that in some instances the models rely more on background elements, such as fences, barn walls, or lighting, than on the cows themselves. This unintended behavior indicates potential biases in the training data and argues for careful dataset design and model validation. Their results are comparable to those of other image classification studies, in which accuracies of 92.9% and 91.66% were reported using artificial neural networks [16,17]. What sets this research apart is its commitment to transparency and interpretability. By using explainable AI (XAI) techniques for livestock identification, the researchers set a benchmark for the responsible and successful application of AI in agriculture, concluding that performance metrics alone are not enough and that such techniques are required to elucidate how an AI model makes predictions.

Kostić et al. [18] introduced a novel approach that combines unmanned aerial vehicle (UAV) imagery with advanced ML to assess and improve seeding precision, with the goal of uniform plant spacing, a pivotal factor for optimal corn yields. Conducted in Serbia, the research involved capturing high-resolution aerial images of corn fields at the V4 growth stage using a DJI Inspire 1 drone. Three approaches to plant detection were compared: template matching, unsupervised VARI-based segmentation, and a Mask R-CNN deep learning network. The best performance was attained by Mask R-CNN, which correctly identified 96.5% of the corn plants and closely reproduced ground-truth spacing measurements. The study further tested the influence of seeder vibration on seed placement accuracy. The data indicated that greater vibration amplitudes translate into more planting misses and, in the majority of cases, correspond to faster seeding speeds. The study thus provides evidence for why improved seeder designs, or speed control adapted to mitigate these effects, are critical. It shows that integrating UAV imaging with AI analysis offers a potential way to improve seeding accuracy, optimize plant density, and ultimately raise corn production efficiency, in line with previous reports [19].
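
For readers unfamiliar with the index behind VARI-based segmentation, the following is a minimal sketch and not the pipeline used in [18]. It assumes the commonly cited definition of the Visible Atmospherically Resistant Index, VARI = (G - R) / (G + R - B). The threshold of 0.1, the function names, and the pixel values are hypothetical and chosen only for illustration.

```python
def vari(r, g, b, eps=1e-9):
    """Visible Atmospherically Resistant Index for one RGB pixel (channels in 0..1)."""
    denom = g + r - b
    if abs(denom) < eps:  # guard against division by zero on degenerate pixels
        return 0.0
    return (g - r) / denom


def vegetation_mask(image, threshold=0.1):
    """Boolean mask that is True where VARI exceeds the (hypothetical) threshold."""
    return [[vari(r, g, b) > threshold for (r, g, b) in row] for row in image]


# Invented 2x2 image: greenish plant pixels on the left, brownish soil on the right.
image = [
    [(0.20, 0.60, 0.10), (0.50, 0.40, 0.30)],
    [(0.25, 0.55, 0.15), (0.45, 0.35, 0.25)],
]
mask = vegetation_mask(image)
```

In practice the threshold would have to be tuned per field and lighting condition, which is one reason a learned detector such as Mask R-CNN can outperform a fixed index.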

In crop production, accurate moisture prediction means better energy management, fewer quality losses, and greater sustainability in grain processing, factors of great importance for engineers, farmers, processors, and consumers alike. Xing et al. [20] present a refined approach to predicting the moisture content of corn, a key factor in ensuring optimal drying performance, by integrating ML with advanced variable selection. The study collected temperature and humidity data from 18 sensor points in a batch-type corn dryer and compared three predictive models: Multiple Linear Regression (MLR), Extreme Learning Machine (ELM), and Long Short-Term Memory (LSTM). The authors showed that the UVE-enhanced ELM model outperformed the rest, achieving an R² of 0.946 and an RMSE of 0.581%, while the UVE-LSTM model also performed strongly, with an R² of 0.934. As the authors conclude, this result provides a more scientific method for predicting the moisture content of corn during drying and lays the foundation for using the prediction model to guide actual production.
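
The two metrics quoted above can be made concrete with a short sketch. It uses the standard definitions, RMSE = sqrt(mean((y - ŷ)²)) and R² = 1 - SS_res / SS_tot. The moisture values are invented for illustration and are not data from [20].

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented moisture contents (%), for illustration only.
measured = [14.0, 15.0, 16.0, 17.0]
predicted = [14.5, 14.5, 16.5, 16.5]

error = rmse(measured, predicted)      # 0.5 percentage points
fit = r_squared(measured, predicted)   # 0.8
```

Because RMSE carries the units of the target, an RMSE reported as a percentage for a moisture model is read as percentage points of moisture content.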

Precision irrigation, supported by IoT, Big Data, and AI, has become a core technology of modern agriculture. Yu et al. [21] predicted irrigation water use for maize by combining the Ensemble Kalman Filter (EnKF) and fuzzy optimization methods with the Decision Support System for Agrotechnology Transfer (DSSAT) model, using remote sensing data on land moisture and leaf area provided by Google Earth Engine (GEE). The resulting EnKF-DSSAT and fuzzy optimization-DSSAT models achieved high accuracy in short-term (98.11%) and long-term (97.78%) forecasts, reported as significantly better than traditional models. The authors concluded that this approach shows significant advantages and effectively improves the accuracy of maize irrigation water demand prediction, especially in addressing the poor accuracy and low resource utilization of traditional irrigation systems.
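
The assimilation step at the heart of an EnKF can be sketched in a few lines. This is a generic, scalar, stochastic (perturbed-observation) EnKF update with a direct observation of the state, not the EnKF-DSSAT configuration of [21]. The function name `enkf_update`, the ensemble size, and the moisture values are assumptions made for illustration.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_var, rng):
    """One stochastic EnKF analysis step for a scalar state that is observed directly."""
    forecast_var = statistics.variance(ensemble)    # sample variance of the forecast
    gain = forecast_var / (forecast_var + obs_var)  # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member assimilates its own randomly perturbed copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - x) for x in ensemble]


rng = random.Random(0)
# Hypothetical forecast ensemble of a moisture state (m3/m3) from a crop model.
forecast = [rng.gauss(0.20, 0.05) for _ in range(200)]
# Hypothetical remote-sensing observation with a small error variance.
analysis = enkf_update(forecast, observation=0.30, obs_var=0.01 ** 2, rng=rng)
```

When the observation is much more certain than the forecast, as in this toy setup, the gain is close to 1 and the analysis ensemble is pulled toward the observation while its spread shrinks.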

In the fruit industry, determining peach firmness is critical for optimizing harvest time and fruit quality. Ivanovski et al. [22] explored the enhancement of Classification and Regression Tree (CART) models using metaheuristic algorithms to improve the prediction accuracy of peach firmness. The study used several metaheuristic algorithms, including the Genetic Algorithm (GA), Bat Algorithm (BA), Differential Evolution (DE), and Particle Swarm Optimization (PSO), to optimize the CART model parameters. The results showed that the CART model optimized using GA gave the most accurate predictions on new data. This empirical experiment shows that the metaheuristic optimization techniques implemented in the metaheuristicOpt and GA R libraries can improve the accuracy of the default CART model. The experiment is, however, limited by the R libraries used and by the optimization approach described; as the authors note, different results could be obtained with other software packages or optimization approaches [22].

With agriculture becoming increasingly robotic and automated, human activity in the field needs to be better understood for safety, efficiency, and unobstructed human–robot collaboration. According to Benos et al. [23], human–robot synergistic systems are considered the most mature way to circumvent problems arising from the complex and unpredictable nature of the agricultural environment, which contrasts with the stable conditions found in industrial settings. The authors described the fundamentals of human–robot interaction (HRI) from an agriculture-oriented perspective and sought to identify potential hazards that can put human safety at risk. They also discussed inertial measurement units (IMUs), electronic devices that use accelerometers and gyroscopes to measure physical motion parameters such as rotational rate and acceleration. They concluded that a combined approach is foreseen, one that reprioritizes safety measures, redetermines practical limits, identifies the riskiest postures, and increases awareness of the risk factors. They also suggested that specialists from different fields (ergonomists, physicians, engineers, manufacturers, IoT developers, and international organizations) need to collaborate intensively and systematically to optimize and safeguard occupational health and safety in this demanding and promising field.

In aquaculture, effectively managing and understanding disease outbreaks is paramount. Traditional methods often struggle with the overlapping relationships, specialized terminology, and imbalanced datasets prevalent in the literature on aquatic disease [24]. To address these challenges, Ye et al. [24] developed Fd-CasBGRel, a model that integrates the CasRel framework with a BRC feature fusion module. This module combines self-attention mechanisms, BiLSTM, relative position encoding, and conditional layer normalization to capture nuanced contextual information. Additionally, the model employs the Gradient Harmonizing Mechanism (GHM) loss function to mitigate the effects of category imbalance. Evaluated on a curated aquatic disease corpus, Fd-CasBGRel achieved an F1 score of 84.71%, surpassing several benchmark models. Notably, it attained an F1 score of 86.52% on datasets with overlapping relationships, demonstrating its robustness in handling complex data structures. Furthermore, tests on the publicly available WebNLG dataset confirmed the model’s generalization capabilities. The authors compared the model with several other leading entity–relationship extraction models on the AquaticDiseaseRE dataset and the WebNLG dataset [25,26,27]. Although location features play a pivotal role, the study acknowledges that other features, both lexical and syntactic, remain underexploited. Fd-CasBGRel can potentially contribute to the construction of comprehensive knowledge graphs in the aquatic disease domain, facilitating better disease management and research, although this potential requires further investigation.
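
As a reminder of what an F1 score measures in relation extraction, the sketch below scores predicted (subject, relation, object) triples against gold triples by exact match, with F1 = 2PR / (P + R). Exact-match scoring is a common convention and is assumed here; the editorial does not state the exact protocol used in [24]. The triples and the function name `triple_f1` are invented for illustration.

```python
def triple_f1(gold, predicted):
    """Exact-match F1 over sets of (subject, relation, object) triples."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented triples; "disease A" and "disease A" -> "fish X" share a subject,
# which is the kind of overlap that makes extraction harder.
gold = [
    ("disease A", "caused_by", "pathogen P"),
    ("disease A", "affects", "fish X"),
    ("disease B", "caused_by", "pathogen Q"),
    ("disease B", "treated_with", "drug D"),
]
predicted = [
    ("disease A", "caused_by", "pathogen P"),
    ("disease A", "affects", "fish X"),
    ("disease B", "caused_by", "pathogen Q"),
    ("disease B", "treated_with", "drug E"),  # wrong object, counts as an error
]

score = triple_f1(gold, predicted)  # precision 0.75, recall 0.75, F1 0.75
```

A single wrong element in a triple counts as both a false positive and a false negative, which is why F1 on overlapping relations is a demanding measure.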

Author Contributions

Conceptualization, P.A.V. and B.M.; writing—original draft preparation, P.A.V.; writing—review and editing, P.A.V. and B.M.; visualization, P.A.V. and B.M. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

The Guest Editors would like to thank all the authors and peer reviewers for their valuable contributions to this Special Issue. Congratulations to all authors, regardless of the final decisions on the submitted manuscripts; the feedback, comments, and suggestions from the reviewers and editors helped the authors to improve their papers. We would like to take this opportunity to record our sincere gratitude to all the reviewers and editors of the journal.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Shaikh, T.A.; Rasool, T.; Lone, F.R. Towards leveraging the role of machine learning and artificial intelligence in precision agriculture and smart farming. Comput. Electron. Agric. 2022, 198, 107119.
2. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90.
3. Jeong, J.H.; Resop, J.P.; Mueller, N.D.; Fleisher, D.H.; Yun, K.; Butler, E.E.; Timlin, D.J.; Shim, K.-M.; Gerber, J.S.; Reddy, V.R.; et al. Random forests for global and regional crop yield predictions. PLoS ONE 2016, 11, e0156571.
4. Abade, A.; Ferreira, P.A.; de Barros Vidal, F. Plant diseases recognition on images using convolutional neural networks: A systematic review. Comput. Electron. Agric. 2021, 185, 106125.
5. Getahun, S.; Kefale, H.; Gelaye, Y. Application of precision agriculture technologies for sustainable crop production and environmental sustainability: A systematic review. Sci. World J. 2024, 2024, 2126734.
6. Mittal, S.; Mehar, M. Socio-economic factors affecting adoption of modern information and communication technology by farmers in India: Analysis using multivariate probit model. J. Agric. Educ. Ext. 2016, 22, 199–212.
7. Mulla, D.J. Twenty five years of remote sensing in precision agriculture: Key advances and remaining knowledge gaps. Biosyst. Eng. 2013, 114, 358–371.
8. Reyes, F.; Casa, R.; Tolomio, M.; Dalponte, M.; Mzid, N. Soil properties zoning of agricultural fields based on a climate-driven spatial clustering of remote sensing time series data. Eur. J. Agron. 2023, 150, 126930.
9. Noorain; Srivastava, V.; Parveen, B.; Parveen, R. Artificial intelligence in drug formulation and development: Applications and future prospects. Curr. Drug Metab. 2023, 24, 622–634.
10. Helmy, M.; Elhalis, H.; Liu, Y.; Chow, Y.; Selvarajoo, K. Perspective: Multiomics and machine learning help unleash the alternative food potential of microalgae. Adv. Nutr. 2023, 14, 1–11.
11. Temitope Olayinka, O. Revolutionalizing Agriculture: The Role of IoT, Artificial Intelligence and Advanced Analytics. Int. J. Innov. Sci. Res. Technol. 2025, 10, 2665–2668.
12. Li, J.; Qian, J.; Chen, J.; Ruiz-Garcia, L.; Dong, C.; Chen, Q.; Liu, Z.; Xiao, P.; Zhao, Z. Recent advances of machine learning in the geographical origin traceability of food and agro-products: A review. Compr. Rev. Food Sci. Food Saf. 2025, 24, e70082.
13. Nakaguchi, V.M.; Abeyrathna, R.R.D.; Ahamed, T. Development of a new grading system for quail eggs using a deep learning-based machine vision system. Comput. Electron. Agric. 2024, 226, 109433.
14. Alvarez-García, W.Y.; Mendoza, L.; Muñoz-Vílchez, Y.; Nuñez-Melgar, D.C.; Quilcate, C. Implementing artificial intelligence to measure meat quality parameters in local market traceability processes. Int. J. Food Sci. Technol. 2024, 59, 8058–8068.
15. Tanchev, D.; Marazov, A.; Balieva, G.; Lazarova, I.; Rankova, R. Exploring Attributions in Convolutional Neural Networks for Cow Identification. Appl. Sci. 2025, 15, 3622.
16. Yoon, Y.; Hwang, T.; Lee, H. Prediction of radiographic abnormalities by the use of bag-of-features and convolutional neural networks. Vet. J. 2018, 237, 43–48.
17. Salvi, M.; Molinari, F.; Iussich, S.; Muscatello, L.V.; Pazzini, L.; Benali, S.; Banco, B.; Abramo, F.; De Maria, R.; Aresu, L. Histopathological classification of canine cutaneous round cell tumors using deep learning: A multi-center study. Front. Vet. Sci. 2021, 8, 640944.
18. Kostić, M.M.; Grbović, Ž.; Waqar, R.; Ivošević, B.; Panić, M.; Scarfone, A.; Tagarakis, A.C. Corn Plant In-Row Distance Analysis Based on Unmanned Aerial Vehicle Imagery and Row-Unit Dynamics. Appl. Sci. 2024, 14, 10693.
19. Badua, S.A.; Sharda, A.; Strasser, R.; Ciampitti, I. Ground speed and planter downforce influence on corn seed spacing and depth. Precis. Agric. 2021, 22, 1154–1170.
20. Xing, S.; Lin, Z.; Gao, X.; Wang, D.; Liu, G.; Cao, Y.; Liu, Y. Research on Outgoing Moisture Content Prediction Models of Corn Drying Process Based on Sensitive Variables. Appl. Sci. 2024, 14, 5680.
21. Yu, Y.; Luo, Y.; Wang, X.; Wang, X.; Hu, C. Precise Assimilation Prediction of Short-term and Long-term Maize Irrigation Water Based on EnKF-DSSAT and Fuzzy Optimization-DSSAT Models. IEEE Access 2025, 13, 27150–27166.
22. Ivanovski, T.; Zhang, X.; Jemric, T.; Gulic, M.; Matetic, M. Peach Firmness Prediction Using Optimized Regression Trees Models. In Proceedings of the 33rd DAAAM International Symposium, Vienna, Austria, 27–28 October 2022; Katalinic, B., Ed.; DAAAM International: Vienna, Austria, 2022; pp. 0480–0489, ISBN 978-3-902734-36-5.
23. Benos, L.; Bechar, A.; Bochtis, D. Safety and ergonomics in human-robot interactive agricultural operations. Biosyst. Eng. 2020, 200, 55–72.
24. Ye, H.; Lv, L.; Zhou, C.; Sun, D. Fd-CasBGRel: A Joint Entity–Relationship Extraction Model for Aquatic Disease Domains. Appl. Sci. 2024, 14, 6147.
25. Zheng, S.; Wang, F.; Bao, H.; Hao, Y.; Zhou, P.; Xu, B. Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017.
26. Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A novel cascade binary tagging framework for relational triple extraction. arXiv 2019, arXiv:1909.03227.
27. Zheng, H.; Wen, R.; Chen, X.; Yang, Y.; Zhang, Y.; Zhang, Z.; Zhang, N.; Qin, B.; Ming, X.; Zheng, Y. PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Virtual, 1–6 August 2021; pp. 6225–6235.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
