Evaluation of Learning-Based Models for Crop Recommendation in Smart Agriculture

Bakr, Muhammad Abu; Khan, Ahmad Jaffar; Khan, Sultan Daud; Zafar, Mohammad Haseeb; Ullah, Mohib; Ullah, Habib

doi:10.3390/info16080632

by Muhammad Abu Bakr ¹, Ahmad Jaffar Khan ¹, Sultan Daud Khan ², Mohammad Haseeb Zafar ³, Mohib Ullah ⁴,* and Habib Ullah ⁵

¹ Department of Electrical Engineering, National University of Technology, Islamabad 44000, Pakistan

² Department of Computer Science, National University of Technology, Islamabad 44000, Pakistan

³ Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff CF5 2YB, UK

⁴ Intelligent Systems and Analytics (ISA) Research Group, Department of Computer Science (IDI), Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway

⁵ Faculty of Science and Technology, Norwegian University of Life Sciences, 1433 Ås, Norway

* Author to whom correspondence should be addressed.

Information 2025, 16(8), 632; https://doi.org/10.3390/info16080632

Submission received: 2 June 2025 / Revised: 9 July 2025 / Accepted: 14 July 2025 / Published: 24 July 2025

(This article belongs to the Special Issue Natural Language Processing (NLP) with Applications and Natural Language Understanding (NLU))

Abstract

The use of intelligent crop recommendation systems has become crucial in the era of smart agriculture to increase yield and enhance resource utilization. In this study, we compared different machine learning (ML) and deep learning (DL) models that use structured tabular data for crop recommendation. In our experiments, both ML and DL models achieved good performance; however, their architectures are not suited to conversational systems. To overcome this limitation, we converted the structured tabular data into descriptive textual data and used it to fine-tune Large Language Models (LLMs), including BERT and GPT-2. In comprehensive experiments, GPT-2 achieved an accuracy of 99.55%, higher than that of the best-performing ML and DL models, while maintaining a precision of 99.58% and a recall of 99.55%. GPT-2 therefore not only delivers competitive accuracy but also offers natural language interaction, which makes it a viable option for real-time agricultural decision support systems.

Keywords: smart agriculture; large language models (LLMs); crop recommendation; GPT-2; BERT

1. Introduction

The world’s economy depends heavily on agriculture, which produces food, fiber, and other necessities for human beings [1]. However, as the world population grows and the climate changes, the role of soil and fertilizers in sustaining soil health and maximizing crop yields becomes increasingly important. The agricultural sector is facing tremendous challenges in meeting the rising food demand [2,3]. It is well known that crop growth and general agricultural sustainability can be greatly improved by applying fertilizers precisely and choosing the correct type of soil. Nevertheless, the traditional farming industry has long been plagued by a lack of intelligent suggestions; its practices are frequently based on broad principles, historical understanding, and limited experimentation [4]. Traditional approaches fail to take into account the unique requirements of different crops and fields, frequently resulting in inefficient resource allocation, higher farming costs, and less favorable environmental conditions [5]. In the face of these obstacles, the agriculture industry is poised for change, and the need to transform farming methods is becoming urgent.

A potential answer to these issues is precision agriculture (PA), a subset of smart agriculture that can increase the sustainability and efficiency of farming methods [6]. To optimize agricultural operations, PA employs data-driven methods and cutting-edge technology [7,8,9]. Sensors, drones, satellite navigation technologies, and data analytics are often crucial components of precision agriculture [10,11]. Increasing the productivity, efficiency, and sustainability of farming operations through more accurate and informed decision-making based on real-time data requires a shift toward a more efficient, analytics-driven, and user-friendly approach [12]. This shift is driven in particular by the emergence of Crop and Fertilizer Recommendation Systems (CFRSs), which use technologies such as the Internet of Things (IoT), data analytics, and artificial intelligence (AI) to give farmers personalized recommendations [13,14]. Such a system has the potential to improve the interactions between soil and fertilizers, increase agricultural production, and encourage sustainable practices. Furthermore, CFRSs can reduce uncertainty, provide farmers with insightful information, and reduce the risks associated with conventional farming practices.

In recent years, ML applications [15] have become part of our lives in a variety of fields, including urbanization, education [16], health, and the defense industry [17]. They have also proven useful in decision-making scenarios. They generate knowledge and technological solutions at the same time, thus serving as the foundation for recently developed search engine infrastructure and AI-based chatbots such as Gemini 1.5 and ChatGPT-4 (Chat Generative Pretrained Transformer-4) from OpenAI (San Francisco, CA, USA), among other tools [18]. Numerous research firms indicate that these emerging trends will continue to expand across a range of platforms. In this regard, ML-based systems and solutions are expected to significantly boost the efficacy of the technology sector, and ML models will alter numerous industries, including traffic prediction [19] and chip design [14].

In this study, we comprehensively evaluated and compared the performance of various LLMs, ML, and DL models for crop recommendation systems. We showed through controlled experiments on structured tabular data that, while ML and DL models achieve high predictive accuracy, their frameworks are unsuitable for interactive applications. To fill this gap, we used a novel technique that converts structured sensor-based numerical data into descriptive textual inputs, allowing the fine-tuning of an LLM (GPT-2) for chatbot-based consulting systems. The fine-tuned GPT-2 model not only matched traditional models’ performance (99.55% accuracy, 1.00 AUC), but it also introduced natural language interaction functionality. Despite its long training time (3255 s), the model is valuable for real-time, user-friendly agricultural decision support. The following are the contributions of this paper:

It investigates the novel application of LLMs for conversational crop recommendation, addressing a vital yet understudied area in interactive agricultural decision-making.
It presents a comprehensive evaluation of traditional models with advanced LLM-based techniques, highlighting their strengths and limitations in crop recommendation tasks.
It contributes a mechanism for transforming tabular sensor data into textual descriptions, allowing for effective LLM fine-tuning and broadening the usefulness of LLMs in structured-data domains.

The rest of this paper is organized as follows: A review of the relevant literature is provided in Section 2. The methodology of the novel application of LLMs in the crop recommendation domain is proposed in Section 3. In Section 4, the experimental setup is discussed in detail. The experimental findings and discussion are provided in Section 5. Finally, the conclusion and recommendations for future work are covered in Section 6.

2. Related Work

Crop recommendation systems use ML techniques and algorithms to increase crop quality and farmer profit, which in turn benefits the wider economy through a stronger agricultural sector. The literature reviewed in this section covers this topic in detail.

2.1. Machine Learning (ML)

The study [20] focused on developing a crop recommendation system that recommends crop types using various ML and DL algorithms based on a number of parameters. Seven ML techniques were applied during this experimentation, and out of those seven algorithms, Naïve Bayes (NB) and XGBoost achieved the highest accuracy of 99.55%. The authors in [21] applied ML techniques, including Random Forest (RF), Support Vector Machine (SVM), Gradient Descent, Long Short-Term Memory (LSTM), and Lasso Regression, to predict crop yield for five crops in Rajasthan, India. The RF algorithm outperformed the others with an R² of 0.963, RMSE of 0.035, and MAE of 0.0251.

A crop recommendation system was proposed in [22] using an Indian dataset. Different ML techniques were used in this experiment, such as KNN, an artificial neural network (ANN), RF, and SVM. The RF model achieved an accuracy of 99.22%, higher than SVM (97.85%) and KNN (97.95%). The authors of the study [23] evaluated the effectiveness of five ML models on a crop dataset collected from Kaggle and the Indian Chamber of Food and Agriculture (ICFA). SVM, XGBoost, RF, KNN, and decision tree (DT) were trained using yields of individual datasets. Among all classifiers, XGBoost achieved the highest accuracies: 99.09%, 99.30%, and 98.51% on the agricultural, horticultural, and mixed crop datasets, respectively. Sundaresan et al. [24] proposed an IoT- and ML-based system integrating crop selection, autonomous watering, and fertilizer recommendations for crops like apple, rice, maize, and coffee. Among algorithms such as KNN, DT, RF, Gradient Boost, and XGBoost, the RF classifier achieved the highest accuracy of 99.77%.

In [25], the authors proposed an ensemble ML model for optimal crop recommendation based on soil and environmental data. Using KNN, RF, Gaussian NB, Logistic Regression, and SVM as base learners, the model combined predictions through Majority Voting, achieving the highest accuracy of 99.6%. A comprehensive overview of recent ML applications in agriculture was provided by the authors of [26] to address challenges in three areas: pre-harvesting, harvesting, and post-harvesting. According to them, the use of ML applications in agriculture enables more efficient and precise farming with less human labor and higher quality production. The discussion in this section is summarized in Table 1.

2.2. Deep Learning (DL)

Currently, a good amount of work is being conducted to predict the crop yield utilizing DL models instead of ML alone. Gong et al. [27] developed a greenhouse crop yield prediction method combining Temporal Convolutional Networks (TCNs) and RNNs. Evaluated using datasets from real greenhouse tomato farms, the method achieved lower RMSEs compared to traditional ML and deep neural network models. The authors of [28] applied DL techniques for crop classification in smart farming, using datasets of fruits (grapes, apples, citrus, tomatoes) and vegetables (sugarcane, soybean, corn, cucumber, maize, wheat). Convolutional models achieved an average accuracy of 92.51%, while BPNN classifiers reached 99.31%.

The authors of [29] evaluated weeds using the U-Net-MobileNetV2 architecture, aiming for accurate evaluation while minimizing computational costs and inference time. The proposed model achieved 96% accuracy with a mean intersection of 0.851. Moreover, the model was validated in real time on the Jetson Nano platform. A DL-based CNN for classifying wheat leaf diseases was proposed by Khan et al. [30]. The CNN was applied to a total of 407 wheat leaf images in three classes: healthy, septoria, and stripe rust. Because the dataset was small, the authors used data augmentation to increase the number of images. Results showed that the CNN achieved an accuracy of 98.77%, with the potential to prevent crop loss and increase crop yield.

A mobile application for identifying plant leaf diseases and offering solutions was developed in [31] using a CNN model combined with Inception V3, which was trained on a dataset of 80,848 images covering 21 plant leaves and 60 classes. The proposed model achieved 99% accuracy. To prevent crop failure, early disease detection in plant leaves can help farmers. DL plays a major role in this. Islam et al. [32] experimented with different DL classifiers such as VGG-16, VGG-19, and ResNet-50 on a plant village dataset with a total of 10,000 images to detect crop infection. ResNet50 achieved the highest accuracy of 98.98%, with 98.60% accuracy achieved by VGG-16, the second-best model. Table 2 provides a summary of all the DL models discussed in this section.

2.3. Large Language Models (LLMs)

LLMs’ potential to explain scientific information and give tailored, location-specific, and data-driven agriculture recommendations gives them an edge over DL models. The study [33] highlighted the limitations of this strategy and provided technical advice to Nigerian cassava farmers based on real-world GPT testing. To enable the safe and ethical diffusion of LLM functionality throughout farming worldwide, the paper suggested an idealized LLM design process that includes human specialists. Yang et al. [34] highlight LLMs’ viability in the agricultural pest management area. They reveal a novel technique for the multidimensional assessment of pest management suggestions produced by GPT-4. Results show that instruction-based prompting with domain-specific information improves LLM-driven pest management accuracy by 72%. Kuska et al. [35] presented a map of easily accessible agriculture issues where LLM integration appears to be extremely plausible in the coming years. Chia et al. [36] focused on developing a strong pipeline architecture for training LLMs in real-world circumstances. This architecture protects data and exhibits extensive subject understanding. This strategy is based on Retrieval-Augmented Generation (RAG) technology, which sets up information into a streamlined structure for efficient retrieval. Unlike fine-tuning techniques, RAG separates basic knowledge from the model, resulting in a speedier and more adaptable pipeline. This concept converts obtained information into an accessible manner, allowing for quick access to extensive agricultural expertise. In the end, all LLMs discussed in this entire section are summarized in Table 3.

3. Methodology

In this section, we present the methodological framework of the LLMs. We also describe the steps taken to convert sensor data into descriptive text, whose context is more easily understood by both humans and LLMs. A visual representation of the overall system model is provided in Figure 1.

3.1. Large Language Models (LLMs)

LLMs are deep learning models that are trained to comprehend, generate, and reason in human language. LLMs, which are based mostly on the transformer architecture, use mechanisms such as self-attention to process and model associations between words throughout extended text sequences. These models are trained in a self-supervised manner on massive text corpora, learning statistical patterns, semantics, syntax, and even domain expertise. LLMs generally include hundreds of millions to billions of parameters, allowing them to perform a variety of natural language processing (NLP) tasks such as text classification, question answering, summarization, and dialogue generation. Recent examples, including GPT-3.5, GPT-4, and GPT-4o, have considerably expanded the field of natural language understanding.

However, in this study we focused on comparing classical machine learning models with earlier generations of LLMs. To guarantee consistency and transparency in the examination, we chose GPT-2, an unsupervised LLM created by OpenAI that is built on a decoder-only transformer architecture and relies on self-attention [37]. In contrast to recurrent neural networks (RNNs) and convolutional neural networks (CNNs), GPT-2 uses only attention mechanisms to identify contextual connections in text. It is intended for NLP tasks such as text generation, summarization, and question answering.

As seen in the left panel of Figure 2, the GPT-2 LLM has N transformer decoder blocks. A multi-head masked attention layer, a multi-layer perceptron (MLP) layer, normalization, and dropout layers are all included in each decoder block (right panel). The block can learn from the input of the previous block thanks to the residual connection, which is the branching line to the addition operator [38].

In GPT-2, layer normalization is applied before the masked multi-head attention component, and an additional layer normalization follows the last block. The vocabulary contains 50,257 byte pair encoding (BPE) tokens. Compared with the original GPT, the maximum sequence length is raised from 512 to 1024, and the pre-training mini-batch size is increased from 64 to 512. There are four pre-trained GPT-2 variants with different numbers of decoder blocks. The largest one, with $d_{model} = 1600$ and 48 blocks, has 1.5 billion model parameters overall. Additionally, GPT-2’s training dataset differs from GPT’s: it contains over ten billion words, gathered by collecting outbound Reddit links that received at least three karma.
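The parameter count quoted above can be checked with a rough back-of-the-envelope calculation. The sketch below is only an approximation: it assumes the standard transformer layout (four $d_{model} \times d_{model}$ attention projections and an MLP with a 4x hidden width, which the text does not state explicitly) and ignores biases and layer normalization parameters. The function name is hypothetical.

```python
def approx_gpt2_params(d_model, n_blocks, vocab_size=50257, n_positions=1024):
    """Rough weight count for a decoder-only transformer (biases and LayerNorm ignored)."""
    attention = 4 * d_model * d_model       # Q, K, V and output projections
    mlp = 2 * d_model * (4 * d_model)       # two FC layers, 4x hidden width (assumption)
    blocks = n_blocks * (attention + mlp)
    embeddings = vocab_size * d_model + n_positions * d_model  # token + position tables
    return blocks + embeddings


# Largest GPT-2 variant described in the text: d_model = 1600, 48 blocks.
total = approx_gpt2_params(d_model=1600, n_blocks=48)
print(f"{total:,} weights, i.e. about {total / 1e9:.2f} billion")
```

Under these assumptions the estimate lands at roughly 1.56 billion weights, which is consistent with the 1.5 billion figure given for the largest model.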

Multi-Head Attention for Transformer

Transformer models use a crucial process called multi-head attention, which enables the model to concentrate on multiple input segments at once. The right panel in Figure 2 depicts the transformer model structure that was suggested in [39].

Presented below is the mathematical formulation:

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_{1}, \mathrm{head}_{2}, \ldots, \mathrm{head}_{h})\, W^{O}, \tag{1}$$

$$\mathrm{head}_{i} = \mathrm{Attention}(Q W_{i}^{Q}, K W_{i}^{K}, V W_{i}^{V}), \tag{2}$$

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V. \tag{3}$$

Equation (1) defines the multi-head attention mechanism, in which several attention heads are computed in parallel and then concatenated; the weight matrix $W^{O}$ is used for the final projection. Equation (2) represents each individual attention head, where the input queries (Q), keys (K), and values (V) are multiplied by separate learnable weight matrices $W_{i}^{Q}$, $W_{i}^{K}$, and $W_{i}^{V}$. Equation (3) describes scaled dot-product attention, in which the dot product of the queries and keys is used to calculate attention scores. To stabilize gradients, the dot product is divided by $\sqrt{d_{k}}$. These scores are normalized with the softmax function and then used as weights to produce the final weighted sum of the values. By learning several attention patterns simultaneously, multi-head attention enhances the model’s capacity to identify intricate relationships in the input.

Multi-head attention and masked multi-head attention differ in that the former enables the model to perceive the future context, but the latter does not. As a result, they are utilized in the encoder and decoder structures, respectively. A ReLU function sits between the two fully-connected (FC) layers that make up the feed-forward component. Utilizing an FC layer with a softmax function, the output component transforms the output from the last transformer decoder block into probability distributions. Each input embedding has a positional encoding applied to it in order to incorporate the input sequence’s order information.
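To make the difference between plain and masked attention concrete, the following is a minimal pure-Python sketch of the scaled dot-product attention in Equation (3), with an optional causal mask. It handles a single head on tiny list-based matrices and is meant only as an illustration; the function names and the toy inputs are assumptions, not the implementation used in this study.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V, causal=False):
    """Scaled dot-product attention; with causal=True, position i ignores positions j > i."""
    d_k = len(K[0])
    outputs, weights = [], []
    for i, q in enumerate(Q):
        # Dot product of the query with every key, divided by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Masking: future positions get -inf, so softmax assigns them zero weight.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append([sum(w_j * v[c] for w_j, v in zip(w, V)) for c in range(len(V[0]))])
    return outputs, weights


# Toy sequence of three 2-dimensional token vectors (illustrative values only).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
full_out, full_w = attention(X, X, X)
masked_out, masked_w = attention(X, X, X, causal=True)
print(full_w[0])
print(masked_w[0])
```

In the unmasked call the first token spreads its attention over all three positions, whereas in the masked call it can attend only to itself, which is the behaviour a decoder-only model such as GPT-2 relies on.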

3.2. Conversion of Sensor Readings into Descriptive Crop Information

Although LLMs may interpret numerical values in parameter format, we purposely turned the data into contextual, human-readable sentences to provide context, which improves the LLM’s capacity to identify correlations between variables and leads to more accurate predictions. Soil macro-nutrient levels (potassium, phosphorus, and nitrogen) and environmental variables (temperature, humidity, pH, and rainfall) make up the features of the numerical crop recommendation dataset.

We used a rule-based function to accomplish this translation, turning every row of numeric values into a human-readable sentence in plain language, as can be seen in Table 4. This change not only makes it possible for conventional text-based models, such as transformers and RNNs, to process agricultural data more efficiently, but it also makes it possible to take LLMs that have already been pre-trained and fine-tune them to gain domain-specific insights. Additionally, turning numerical data into text makes the data easier to explain and interpret by enabling human-like descriptions of the soil and environmental conditions. With this method, we aim to enhance precision agriculture by combining linguistic traits with numerical patterns to increase crop recommendation accuracy.
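A rule-based conversion of this kind can be sketched in a few lines. The sentence template, the function name, and the column keys below are hypothetical (the exact wording used in Table 4 is not reproduced here), and the sample values are illustrative rather than taken from the dataset.

```python
def row_to_text(row):
    """Turn one numeric record into a plain-language sentence (hypothetical template)."""
    return (
        f"The soil contains nitrogen at {row['N']}, phosphorus at {row['P']}, "
        f"and potassium at {row['K']}. The temperature is {row['temperature']}, "
        f"the humidity is {row['humidity']}, the soil pH is {row['ph']}, "
        f"and the rainfall is {row['rainfall']}."
    )


# Illustrative record; keys are assumed column names for the seven features.
example_row = {
    "N": 90, "P": 42, "K": 43,
    "temperature": 20.9, "humidity": 82.0, "ph": 6.5, "rainfall": 202.9,
}
sentence = row_to_text(example_row)
print(sentence)
```

Each generated sentence can then be paired with its crop label and passed to a tokenizer, so the fine-tuning data has the same form as ordinary text classification input.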

4. Experimental Setup

4.1. Dataset

We used a crop recommendation dataset [40] containing attributes such as nitrogen (N), phosphorus (P), potassium (K), the pH value of soil, humidity, temperature, rainfall, and a classification label. It contains 2200 instances with seven attributes and one classification variable. The dataset has 22 crop classes: rice, maize, jute, cotton, coconut, papaya, orange, apple, muskmelon, watermelon, grapes, mango, banana, pomegranate, lentil, blackgram, mungbean, mothbeans, pigeonpeas, kidneybeans, chickpea, and coffee. An overview of the crop recommendation dataset can be seen in Table 5.

4.2. Implementation Details

Both the training and testing of the model were conducted using Google Colab, leveraging its cloud-based computational resources. Several key Python libraries, namely pandas 2.2.2, scikit-learn 1.6.1, PyTorch 2.7.1+cu126, transformers 4.52.4, and torchvision 0.22.1+cu126, were used for data preprocessing, model construction, and performance analysis. Furthermore, Google Colab’s T4 GPU significantly accelerated model training and inference, which made it a suitable platform for our experiments. The batch size was 8 and the number of epochs was 50, with Adam as the training optimizer. The dataset was randomly split into a training set (80%) and a test set (20%).
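The random 80/20 partition can be illustrated with a short standard-library sketch. The seed and the function name are assumptions (the text does not report a seed or whether the split was stratified); in practice the same split is usually obtained with scikit-learn's `train_test_split`.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle row indices and split them into train and test parts (seed is an assumption)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 2200 instances, as in the dataset description.
train_idx, test_idx = split_indices(2200)
print(len(train_idx), len(test_idx))
```

With 2200 instances this leaves 1760 rows for training and 440 for testing.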

4.3. Evaluation Metrics

Accuracy, precision, recall, and F1-score were the four main evaluation measures used to assess the model’s performance. These metrics offer a thorough insight into the model’s ability to differentiate between various classes. Their mathematical formulas are as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{4}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{5}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{6}$$

$$F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \tag{7}$$

where $TP$ (True Positives) denotes correctly predicted positive samples and $TN$ (True Negatives) denotes correctly predicted negative samples. False Positives ($FP$) are samples that are predicted to be positive but are actually negative, while False Negatives ($FN$) are positive samples that are incorrectly predicted to be negative.
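For a multi-class problem such as this one, Equations (4) to (7) are typically evaluated per class (one class against the rest) and then averaged. The sketch below does this with support-weighted averaging; the averaging scheme is an assumption, since the text does not state which one was used, and the tiny label lists are illustrative only.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1 (weighting is an assumption)."""
    n = len(y_true)
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0          # Equation (5)
        r_c = tp / (tp + fn) if tp + fn else 0.0          # Equation (6)
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0  # Equation (7)
        weight = support[c] / n                           # share of true samples in class c
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: one rice sample is misclassified as jute.
y_true = ["rice", "rice", "maize", "maize"]
y_pred = ["rice", "jute", "maize", "maize"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

One property worth noting is that support-weighted recall always equals overall accuracy, which is why these two columns often coincide in multi-class result tables.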

4.4. Comparison with Traditional Models

To evaluate the performance of our model, we compared it with various LLMs and ML and DL models. The comparison was conducted with models that take either numerical data or textual data to analyze their effectiveness. We compared our model with several traditional ML models commonly used for classification and predictive tasks. The ML models used for comparison included KNN, RF, DT, SVM, NB, and MLP. We also compared our model with various DL architectures such as LSTM, Bidirectional LSTM (BI-LSTM), and CNN-LSTM. All these mentioned ML and DL models used a numerical dataset as input. Along with these models, we also compared it with the BERT LLM, which uses textual data as input.

5. Results and Discussions

In this section, the results of experiments on different models using the crop recommendation dataset are discussed in detail.

Comparison of Various Models

Table 6 compares the accuracy, precision, recall, F1-score, and computational time (training time) of the different classifiers trained on numerical and textual data.

With an accuracy of 99.54%, the NB classifier was the most accurate ML model. NB also achieved precision, recall, and F1-score values of 99.58%, 99.55%, and 99.54%, respectively. Additionally, NB maintained an impressively low training time of just 0.01 s, making it the most computationally efficient model trained on numerical data. This exceptional combination of accuracy and speed makes NB particularly suitable for real-time applications where both high performance and low computational costs are critical.

Meanwhile, CNN-LSTM, a DL-based model trained on numerical data, achieved the best performance among other DL models, with an accuracy of 97.95%, a precision of 97.99%, a recall of 97.95%, and an F1 score of 97.95%. These metrics suggest that CNN-LSTM is highly effective in making accurate predictions and maintaining consistency across various evaluation criteria among DL models when using numerical data for training. Despite its complex architecture designed to capture sequential dependencies, CNN-LSTM remained computationally efficient as compared to BI-LSTM, with a training time of 77.75 s. On the other hand, it is more computationally expensive than LSTM, which is less accurate as compared to CNN-LSTM. This combination of high performance and efficiency makes CNN-LSTM a standout choice when using numerical datasets, which are predominantly feature-based. Nevertheless, CNN-LSTM was unable to outperform the NB ML model in performance on numerical data.

However, when it comes to training on textual data, LLMs take center stage because they can process text directly. Among the LLMs used, GPT-2 outperformed BERT when trained on textual data. GPT-2 achieved 99.55% accuracy, 99.58% precision, 99.55% recall, and a 99.57% F1-score, while requiring less training time (3255.01 s) than the BERT model. The GPT-2 model was able to distinguish between distinct categories in the dataset, possibly as a result of its complex architecture and thorough pre-training, which allowed it to identify subtle patterns when trained on textual data.

Using precision–recall curves, ROC curves, and confusion matrices, the effectiveness of ML, DL, and LLMs for crop recommendation was thoroughly assessed (Figure 3, Figure 4, Figure 5 and Figure 6). For crop recommendation, both the RF and GPT-2 models performed exceptionally well, while having notably different operating features. Both exhibited outstanding classification abilities with flawless AUC ratings (1.00) across all 22 crop categories. With just two small misclassifications—one of rice being incorrectly identified as jute (5% error rate) and one of maize being incorrectly identified as blackgram—RF performed almost flawlessly. In the same way, GPT-2 upheld this high standard but exhibited distinct error patterns, misclassifying maize as blackgram and pigeonpeas as mango once each. For the majority of crops, both models obtained flawless AUPRC scores (1.00); however, RF performed marginally better for rice and jute classification, while GPT-2 performed slightly better for pigeonpeas and mango.
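The reported error counts can be cross-checked against the headline accuracy with simple arithmetic. The sketch below assumes that the test set holds exactly 20% of the 2200 instances and that it contains 20 rice samples (which is what a 5% error rate for a single misclassification implies); neither figure is stated explicitly in the text.

```python
n_instances = 2200
test_size = round(n_instances * 0.20)      # assumed size of the 20% test split

# GPT-2: two misclassifications (maize as blackgram, pigeonpeas as mango).
gpt2_errors = 2
gpt2_accuracy = 100 * (test_size - gpt2_errors) / test_size

# RF: one rice sample labelled as jute, assuming 20 rice samples in the test set.
rice_error_rate = 100 * 1 / 20

print(f"GPT-2 accuracy: {gpt2_accuracy:.2f}%")
print(f"Rice error rate: {rice_error_rate:.0f}%")
```

Under these assumptions, two errors out of 440 test samples give an accuracy that rounds to the 99.55% reported for GPT-2.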

Overall, LLM-based models were able to outperform traditional ML and DL models, with GPT-2 performing best in most cases. BERT also demonstrated competitive performance, despite being trained on textual representations of structured data. However, this performance came at a cost of high training time, due to the substantial computational complexity of LLMs being trained on textual data in general.

6. Conclusions

This study highlights the potential of LLMs in advancing precision agriculture through intelligent crop recommendation systems. Our investigation shows that while traditional ML and DL models exhibit good prediction performance, they lack conversational functionality essential for practical agricultural advisory systems. To close this gap, we devised a novel method for converting structured agricultural data into natural language inputs, allowing for the effective fine-tuning of the GPT-2 based LLM model. The GPT-2 demonstrated remarkable performance (99.55% accuracy, 1.00 AUC), suggesting that LLMs can not only match traditional models in recommendation accuracy but also provide the critical advantage of natural language interaction.

The development of LLMs, especially GPT-2, highlights exciting possibilities which can be used for developing intelligent, context-aware chatbots that can assist farmers in the future.

Author Contributions

S.D.K. and M.A.B. conceptualized the study and developed the methodology. A.J.K. performed the formal analysis and software implementation and was responsible for the investigation and data curation. A.J.K., S.D.K. and M.A.B. carried out validation. S.D.K. and M.A.B. provided the resources, and A.J.K. prepared the visualizations. A.J.K. wrote the original draft of the manuscript, and S.D.K., M.H.Z., M.U., H.U. and M.A.B. reviewed and edited it. S.D.K., M.A.B. and H.U. were in charge of supervision and project administration, and S.D.K. was responsible for funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Research Program for Universities (NRPU) of the Higher Education Commission (HEC) of Pakistan (20-17332/NRPU/R&D/HEC/2021-2020).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

We used only a publicly available dataset from the Kaggle platform. We did not create any new dataset.

Acknowledgments

We gratefully acknowledge the Norwegian University of Science and Technology (NTNU), Norway, for covering the article processing charges (APC) through its Open Access Publishing Fund.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Pawlak, K.; Kołodziejczak, M. The role of agriculture in ensuring food security in developing countries: Considerations in the context of the problem of sustainable food production. Sustainability 2020, 12, 5488.
2. Aslam, S.; Bakr, M.A.; Khan, S.D. Segmentation of Agricultural Fields in Aerial Imagery Using Enhanced Deep Learning Models. In Proceedings of the 2024 19th International Conference on Emerging Technologies (ICET), Topi, Pakistan, 19–20 November 2024; pp. 1–6.
3. Tasfe, M.; Nivrito, A.; Al Machot, F.; Ullah, M.; Ullah, H. Deep Learning Based Models for Paddy Disease Identification and Classification: A Systematic Survey. IEEE Access 2024, 12, 100862–100891.
4. Khan, N.; Ray, R.L.; Sargani, G.R.; Ihtisham, M.; Khayyam, M.; Ismail, S. Current progress and future prospects of agriculture technology: Gateway to sustainable agriculture. Sustainability 2021, 13, 4883.
5. Chen, Y.; Kuang, J.; Cheng, D.; Zheng, J.; Gao, M.; Zhou, A. AgriKG: An agricultural knowledge graph and its applications. In Proceedings of the Database Systems for Advanced Applications: DASFAA 2019 International Workshops: BDMS, BDQM, and GDMA, Chiang Mai, Thailand, 22–25 April 2019; Proceedings 24. Springer: Berlin/Heidelberg, Germany, 2019; pp. 533–537.
6. Khan, A.J.; Bakr, M.A.; Khan, S.D. Vegetation Detection Using UAV Imagery with Deep Learning Segmentation Models. In Proceedings of the 2024 19th International Conference on Emerging Technologies (ICET), Topi, Pakistan, 19–20 November 2024; pp. 1–6.
7. Tantalaki, N.; Souravlas, S.; Roumeliotis, M. Data-driven decision making in precision agriculture: The rise of big data in agricultural systems. J. Agric. Food Inf. 2019, 20, 344–380.
8. Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine learning applications for precision agriculture: A comprehensive review. IEEE Access 2020, 9, 4843–4873.
9. Heyden, W.; Ullah, H.; Siddiqui, M.S.; Al-Machot, F. An integral projection-based semantic autoencoder for zero-shot learning. IEEE Access 2023, 11, 85351–85360.
10. Bucci, G.; Bentivoglio, D.; Finco, A. Precision agriculture as a driver for sustainable farming systems: State of art in literature and research. Calitatea 2018, 19, 114–121.
11. Musanase, C.; Vodacek, A.; Hanyurwimfura, D.; Uwitonze, A.; Kabandana, I. Data-driven analysis and machine learning-based crop and fertilizer recommendation system for revolutionizing farming practices. Agriculture 2023, 13, 2141.
12. Akhter, R.; Sofi, S.A. Precision agriculture using IoT data analytics and machine learning. J. King Saud-Univ.-Comput. Inf. Sci. 2022, 34, 5602–5618.
13. Hossain, M.; Siddique, M. Online Fertilizer Recommendation System (OFRS): A step towards precision agriculture and optimized fertilizer usage by smallholder farmers in Bangladesh: Online fertilizer recommendation. Eur. J. Environ. Earth Sci. 2020, 1, e47.
14. Bhat, S.A.; Huang, N.F. Big data and ai revolution in precision agriculture: Survey and challenges. IEEE Access 2021, 9, 110209–110222.
15. Khan, S.D.; Ullah, R.; Rahim, M.A.; Rashid, M.; Ali, Z.; Ullah, M.; Ullah, H. An efficient deep learning framework for face mask detection in complex scenes. In Proceedings of the IFIP International Conference on Artificial Intelligence Applications and Innovations, Corfu, Greece, 27–30 June 2022; Springer: Cham, Switzerland, 2022; pp. 159–169.
16. Ullah, M.; Amin, S.U.; Munsif, M.; Yamin, M.M.; Safaev, U.; Khan, H.; Khan, S.; Ullah, H. Serious games in science education: A systematic literature. Virtual Real. Intell. Hardw. 2022, 4, 189–209.
17. Ullah, H.; Uzair, M.; Jan, Z.; Ullah, M. Integrating industry 4.0 technologies in defense manufacturing: Challenges, solutions, and potential opportunities. Array 2024, 23, 100358.
18. Li, J.; Xu, M.; Xiang, L.; Chen, D.; Zhuang, W.; Yin, X.; Li, Z. Large language models and foundation models in smart agriculture: Basics, opportunities, and challenges. arXiv 2023, arXiv:2308.06668.
19. Singh, R.K.; Berkvens, R.; Weyn, M. AgriFusion: An architecture for IoT and emerging technologies based on a precision agriculture survey. IEEE Access 2021, 9, 136253–136283.
20. Sharma, P.; Dadheech, P.; Senthil, A.S.K. AI-Enabled Crop Recommendation System Based on Soil and Weather Patterns. In Artificial Intelligence Tools and Technologies for Smart Farming and Agriculture Practices; IGI Global: New York, NY, USA, 2023; pp. 184–199.
21. Jhajharia, K.; Mathur, P.; Jain, S.; Nijhawan, S. Crop yield prediction using machine learning and deep learning techniques. Procedia Comput. Sci. 2023, 218, 406–417.
22. Mohapatra, B.N.; Kale, V. Crop recommendation system using Machine Learning. ITEGAM-JETIA 2024, 10, 63–68.
23. Dey, B.; Ferdous, J.; Ahmed, R. Machine learning based recommendation of agricultural and horticultural crop farming in India under the regime of NPK, soil pH and three climatic variables. Heliyon 2024, 10, e25112.
24. Sundaresan, S.; Johnson, S.D.; Bharathy, V.M.; Kumar, P.M.P.; Surendar, M. Machine learning and IoT-based smart farming for enhancing the crop yield. In Proceedings of the Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2023; Volume 2466, p. 012028.
25. Sandhya, K.G.; Vemuri, S.; Deeksha, K.S.; Anvitha, T. Crop recommendation system using ensembling technique. In Proceedings of the 2022 International Conference on Breakthrough in Heuristics And Reciprocation of Advanced Technologies (BHARAT), Visakhapatnam, India, 7–8 April 2022; pp. 55–58.
26. Meshram, V.; Patil, K.; Meshram, V.; Hanchate, D.; Ramkteke, S. Machine learning in agriculture domain: A state-of-art survey. Artif. Intell. Life Sci. 2021, 1, 100010.
27. Gong, L.; Yu, M.; Jiang, S.; Cutsuridis, V.; Pearson, S. Deep learning based prediction on greenhouse crop yield combined TCN and RNN. Sensors 2021, 21, 4537.
28. Darwin, B.; Dharmaraj, P.; Prince, S.; Popescu, D.E.; Hemanth, D.J. Recognition of bloom/yield in crop images using deep learning models for smart agriculture: A review. Agronomy 2021, 11, 646.
29. Qureshi, M.F.; Amin, F.; Mushtaq, Z.; Ali, M.; Haris, A.A.; Rana, A.Y. Real-Time Weed Segmentation in Tobacco Crops Utilizing Deep Learning on a Jetson Nano. In Proceedings of the 2024 International Conference on Engineering & Computing Technologies (ICECT), Islamabad, Pakistan, 23 May 2024; pp. 1–6.
30. Khan, A.A.; Raza, S.; Qureshi, M.F.; Mushtaq, Z.; Taha, M.; Amin, F. Deep learning-based classification of wheat leaf diseases for edge devices. In Proceedings of the 2023 2nd International Conference on Emerging Trends in Electrical, Control, and Telecommunication Engineering (ETECTE), Lahore, Pakistan, 27–29 November 2023; pp. 1–6.
31. Bhatti, U.A.; Bazai, S.U.; Hussain, S.; Fakhar, S.; Ku, C.S.; Marjan, S.; Yee, P.L.; Jing, L. Deep Learning-Based Trees Disease Recognition and Classification Using Hyperspectral Data. Comput. Mater. Contin. 2023, 77, 681.
32. Islam, M.M.; Adil, M.A.A.; Talukder, M.A.; Ahamed, M.K.U.; Uddin, M.A.; Hasan, M.K.; Sharmin, S.; Rahman, M.M.; Debnath, S.K. DeepCrop: Deep learning-based crop disease prediction with web application. J. Agric. Food Res. 2023, 14, 100764.
33. Tzachor, A.; Devare, M.; Richards, C.; Pypers, P.; Ghosh, A.; Koo, J.; Johal, S.; King, B. Large language models and agricultural extension services. Nat. Food 2023, 4, 941–948.
34. Yang, S.; Yuan, Z.; Li, S.; Peng, R.; Liu, K.; Yang, P. GPT-4 as Evaluator: Evaluating Large Language Models on Pest Management in Agriculture. arXiv 2024, arXiv:2403.11858.
35. Kuska, M.T.; Wahabzada, M.; Paulus, S. AI for crop production–Where can large language models (LLMs) provide substantial value? Comput. Electron. Agric. 2024, 221, 108924.
36. Chia, H.; Oliveira, A.I.; Azevedo, P. Implementation of an intelligent virtual assistant based on LLM models for irrigation optimization. In Proceedings of the 2024 8th International Young Engineers Forum on Electrical and Computer Engineering (YEF-ECE), Lisbon, Portugal, 5 July 2024; pp. 94–100.
37. Shaikh, T.A.; Rasool, T.; Veningston, K.; Yaseen, S.M. The role of large language models in agriculture: Harvesting the future with LLM intelligence. Prog. Artif. Intell. 2024, 14, 117–164.
38. Yang, S.D.; Ali, Z.A.; Wong, B.M. Fluid-gpt (fast learning to understand and investigate dynamics with a generative pre-trained transformer): Efficient predictions of particle trajectories and erosion. Ind. Eng. Chem. Res. 2023, 62, 15278–15289.
39. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. Available online: https://papers.nips.cc/paper_files/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (accessed on 4 November 2024).
40. Ingle, A. Crop Recommendation Dataset. 2020. Available online: https://www.kaggle.com/datasets/atharvaingle/crop-recommendation-dataset (accessed on 4 November 2024).

Figure 1. Architecture of the proposed LLM-based system.

Figure 2. An overview of the baseline GPT-2 model architecture.

Figure 3. Visual comparison of various ML models (KNN, RF, DT) on the crop recommendation dataset using confusion matrices, precision–recall curves, and ROC curves. (Zoom in for better visibility.)

Figure 4. Visual comparison of various ML models (SVM, NB, MLP) on the crop recommendation dataset using confusion matrices, precision–recall curves, and ROC curves. (Zoom in for better visibility.)

Figure 5. Visual comparison of various DL models (LSTM, BiLSTM, CNN-LSTM) on the crop recommendation dataset using confusion matrices, precision–recall curves, and ROC curves. (Zoom in for better visibility.)

Figure 6. Visual comparison of LLMs (GPT-2 and BERT) on the crop recommendation dataset using confusion matrices, precision–recall curves, and ROC curves. (Zoom in for better visibility.)

Table 1. Summary of research contributions in the domain of ML.

Authors	Year	Dataset	Method Used	Advantages	Limitations
Sharma et al. [20]	2023	Soil and weather pattern datasets (region-specific)	AI-enabled recommendation system	Integrates soil and weather data for precise recommendations; scalable for smart farming applications.	Dependency on accurate soil and weather datasets; computationally intensive for real-time applications.
Jhajharia et al. [21]	2023	Crop yield data (FAO, ICAR)	ML and DL techniques	Combines temporal and spatial data for improved yield prediction.	Computationally expensive; complex training process.
Mohapatra et al. [22]	2024	State-specific crop datasets	Decision tree, Random Forest	Easy implementation and high interpretability.	Not suitable for highly complex datasets; limited generalization.
Dey et al. [23]	2024	Indian soil and climate data	ML-based multi-variable analysis	Tailored for specific Indian soil and climatic conditions.	Poor generalization for other regions.
Sundaresan et al. [24]	2023	IoT crop datasets	ML + IoT integration	Real-time monitoring using IoT sensors.	High setup costs for IoT devices; maintenance complexity.
Sandhya et al. [25]	2022	Open-source agricultural datasets	Ensemble techniques	Enhances accuracy through ensemble learning.	Computationally demanding; reduces model interpretability.
Meshram et al. [26]	2021	Review paper (various ML datasets for agriculture)	Survey on ML techniques	Comprehensive overview of ML applications in agriculture; highlights best practices and challenges.	Does not propose or evaluate specific implementations; limited focus on datasets or case studies.

Table 2. Summary of research contributions in the domain of DL.

Authors	Year	Dataset	Method Used	Advantages	Limitations
Gong et al. [27]	2021	Greenhouse yield datasets	TCN + RNN	High accuracy in time series prediction.	High model complexity; requires extensive computational resources.
Darwin et al. [28]	2021	Stress crop image datasets	CNN-based bloom recognition	Detects stress affecting crop yields.	Limited by dataset diversity; prone to overfitting.
Qureshi et al. [29]	2024	Tobacco crop image dataset	DL for real-time weed segmentation on Jetson Nano	Real-time processing; low power consumption; suitable for edge devices.	Limited to specific hardware (Jetson Nano); potential challenges with generalization to other crop types.
Khan et al. [30]	2023	Wheat leaf disease datasets	DL-based disease classification	High precision for wheat leaf disease classification.	Requires high-quality labeled data; resource-intensive for edge devices.
Bhatti et al. [31]	2023	Hyperspectral tree datasets	DL with hyperspectral data	Effective disease detection with hyperspectral imaging.	Requires specialized equipment for hyperspectral imaging.
Islam et al. [32]	2023	Agricultural disease datasets	DeepCrop with web app	Integrates disease prediction with user-friendly web interface.	High dependency on the internet and cloud infrastructure.

Table 3. Summary of research contributions in the domain of LLMs.

Authors	Year	Dataset	Method Used	Advantages	Limitations
Tzachor et al. [33]	2023	Agricultural extension datasets	LLMs for agricultural extension services	Supports diverse applications; adaptable to multiple tasks.	Limited domain-specific knowledge compared to focused models.
Yang et al. [34]	2024	Pest datasets (simulated cases)	GPT-4 for evaluation	Evaluates effectiveness of pest management strategies; offers rich insights.	Costly to fine-tune and deploy for specific agricultural contexts.
Kuska et al. [35]	2024	General crop data	LLM integration in precision farming	Broad utility in crop-related decision-making; facilitates intelligent insights.	Limited by current capabilities of LLMs in agricultural datasets.
Chia et al. [36]	2024	Irrigation data (simulated)	Virtual assistant using LLM	Improves irrigation efficiency and reduces water usage.	Dependent on the quality of training data; costly to maintain and update.

Table 4. Comparison between numerical data and textual form for crop recommendation.

Numerical Data	Textual Data
N: 79, P: 51, K: 16; Temperature: 25.34 °C; Humidity: 68.50%; pH: 6.59; Rainfall: 96.46 mm	“The soil has a nitrogen level of 79, phosphorus level of 51, and potassium level of 16. The temperature is 25.34 °C, humidity is 68.50%, pH is 6.59, and rainfall is 96.46 mm.”
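
The conversion illustrated in Table 4 can be sketched as a simple sentence template. The snippet below is a minimal illustration and not the authors' code: the function name `describe_reading` and its parameter names are assumptions, and the wording is copied from the single example row in Table 4.

```python
def describe_reading(n, p, k, temperature, humidity, ph, rainfall):
    """Render one numerical sample as a descriptive sentence.

    Hypothetical helper: the template mirrors the example in Table 4.
    Floats are printed with two decimals, as in that example.
    """
    return (
        f"The soil has a nitrogen level of {n}, phosphorus level of {p}, "
        f"and potassium level of {k}. "
        f"The temperature is {temperature:.2f} °C, "
        f"humidity is {humidity:.2f}%, pH is {ph:.2f}, "
        f"and rainfall is {rainfall:.2f} mm."
    )


# First row of Table 5 (label: maize)
text = describe_reading(79, 51, 16, 25.34, 68.50, 6.59, 96.46)
print(text)
```

Applied to every row of the dataset, a template of this kind would yield one text sample per reading, with the crop label kept as the classification target.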

Table 5. Overview of crop recommendation dataset.

N	P	K	Temperature (°C)	Humidity (%)	pH	Rainfall (mm)	Label
79	51	16	25.34	68.50	6.59	96.46	maize
22	55	20	33.95	69.96	7.42	61.16	blackgram
21	39	20	27.06	52.30	7.39	60.75	mothbeans
9	8	40	22.49	89.92	6.55	111.66	pomegranate
12	66	20	27.41	63.42	7.34	44.43	lentil
20	72	15	36.00	56.01	7.31	134.86	pigeonpeas
39	24	14	30.55	90.90	7.19	106.07	orange
21	139	201	19.36	83.36	5.98	67.15	grapes
24	80	19	29.68	69.09	6.81	65.66	blackgram
95	30	52	29.48	90.34	6.64	26.04	muskmelon
32	13	42	23.50	92.98	5.79	106.62	pomegranate
32	41	16	28.64	61.39	7.70	68.55	mothbeans

Table 6. Performance comparison of various classifiers on crop recommendation dataset.

Classifiers	Data Type	Accuracy	Precision	Recall	F1-Score	Approx. Training Time
KNN	Numerical	95.68%	96.29%	95.68%	95.67%	0.01 s
RF	Numerical	99.31%	99.37%	99.31%	99.31%	0.51 s
DT	Numerical	98.64%	98.68%	98.64%	98.63%	0.02 s
SVM	Numerical	96.82%	97.15%	96.82%	96.80%	0.51 s
NB	Numerical	99.54%	99.58%	99.55%	99.54%	0.01 s
MLP	Numerical	97.50%	97.95%	97.50%	97.54%	10.45 s
LSTM	Numerical	97.27%	97.66%	97.27%	97.32%	51.97 s
BiLSTM	Numerical	97.05%	97.37%	97.05%	97.10%	85.80 s
CNN-LSTM	Numerical	97.95%	97.99%	97.95%	97.95%	77.75 s
BERT	Textual	98.18%	98.27%	98.18%	98.18%	2937.45 s
GPT-2	Textual	99.55%	99.58%	99.55%	99.57%	3255.01 s
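
The four scores reported in Table 6 can be reproduced for any set of predictions with a few lines of code. The sketch below assumes support-weighted averaging over classes, which is an assumption on our part and is not stated in this table; the function name `weighted_metrics` and the toy labels are likewise hypothetical. Under this averaging, weighted recall is mathematically equal to accuracy, which is consistent with the near-identical Accuracy and Recall columns in Table 6.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights proportional
    to the number of true samples of each class.
    """
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(tp.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = tp[label] / predicted[label] if predicted[label] else 0.0
        r = tp[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy example with three crops and one misclassified maize sample
y_true = ["maize", "maize", "lentil", "lentil", "grapes", "grapes"]
y_pred = ["maize", "lentil", "lentil", "lentil", "grapes", "grapes"]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```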

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

