Peer-Review Record

Automatic Classification of Agricultural Crops Using Sentinel-2 Data in the Rainfed Zone of Southern Kazakhstan

Agronomy 2025, 15(9), 2040; https://doi.org/10.3390/agronomy15092040

by Asset Arystanov¹, Janay Sagin²,³,*, Natalya Karabkina⁴, Ranida Arystanova¹, Farabi Yermekov⁴, Gulnara Kabzhanova⁵, Roza Bekseitova¹, Aliya Aktymbayeva¹ and Nuray Kutymova¹

Reviewer 1: Junfeng Xiong

Reviewer 2: Anonymous

Reviewer 3: Mirzaee Salman

Submission received: 22 July 2025 / Revised: 18 August 2025 / Accepted: 21 August 2025 / Published: 25 August 2025

(This article belongs to the Special Issue Application of Deep and Machine Learning in Crop Monitoring and Management)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

General Comments

This work addresses a very practical and important problem: how to effectively monitor crop types across the vast rainfed agricultural areas of southern Kazakhstan. The comparative analysis of two classification methods with different levels of automation, along with the thorough validation using multi-source data, forms a solid foundation for this study. However, the manuscript has some fundamental issues related to its research focus, methodological description, and language clarity.

Specific Revision Suggestions

(1) Refocus the Research (Title, Abstract, and Introduction): This is the most critical issue. The title, abstract, and parts of the introduction all point to 'crop soil monitoring,' but the actual research content (methods, results, discussion) is entirely about 'crop type classification.' You do not monitor any soil parameters. This discrepancy between the stated topic and the content creates a serious misalignment.

(2) Accurately Describe the Methodology Used: The manuscript prominently features 'Machine Learning (ML)' in the title and abstract. However, the classification technique described in the methods section (2.4) is a 'rule-based classification' or 'decision tree method' based on specific thresholds and logical conditions (the Con(...) function) set for NDVI and PLI. This is fundamentally an expert knowledge-based system, which is different by definition from data-driven machine learning algorithms (such as Random Forest, SVM, neural networks, etc.). If you wish to retain the 'Machine Learning' theme, you would need to actually implement an ML classifier (e.g., RF, SVM) and potentially compare its performance to your existing rule-based method.

Detailed Review of the Manuscript

(1) Provide Details on the Plowed Land Index (PLI): The manuscript mentions the PLI and cites references [22, 23], but it does not briefly explain the principle behind its calculation. Please add a brief description in the methods section explaining how the PLI identifies tillage activities based on changes in soil brightness or the SWIR bands.

(2) Universality of Classification Rules: Section 2.4 lists detailed classification rules and NDVI thresholds. Are these thresholds universal, or were they adjusted for the different agro-climatic zones shown in Figure 3? For example, given that crop phenology differs between the plain and foothill areas (as shown in Figure 8), did the classification rules account for this? Please clarify this in the methods section.

Author Response

Comments 1: Refocus the research (title, abstract, and introduction): This is the most critical issue. The title, abstract, and parts of the introduction refer to "crop soil monitoring", but the actual content of the study (methods, results, discussion) is entirely devoted to "crop type classification". You do not monitor any soil parameters. This discrepancy between the stated topic and the content creates a serious misalignment.

Responses 1: Thank you very much for the time you spent reviewing our manuscript. Your comments are very valuable. Yes, we agree that the original title, abstract, and content of the article were inconsistent. We changed the title to "Automatic Classification of Agricultural Crops Using Sentinel-2 Data in the Rainfed Zone of Southern Kazakhstan" to bring it in line with what we did. We also made appropriate changes to the abstract to reflect the subject and content of our manuscript. You are right that our study involved neither soil monitoring nor the application of machine learning methods. We worked on the classification of agricultural crops. We apologize for the original misunderstanding.

Comments 2: Accurately describe the methodology used: The title and abstract of the manuscript prominently feature "Machine Learning (ML)". However, the classification method described in the methods section (2.4) is a "rule-based classification" or "decision tree method" based on specific thresholds and logical conditions (the Con(...) function) for NDVI and PLI. It is essentially an expert knowledge-based system, which by definition is different from data-driven machine learning algorithms (such as Random Forest, SVM, neural networks, etc.). If you want to retain the "Machine Learning" theme, you need to implement an ML classifier (such as RF, SVM) and perhaps compare its performance with your existing rule-based method.

Responses 2: We fully agree with your analysis. As you said, we used a rule-based classification, not a machine learning method. We changed the article title to reflect properly what we did, removed the "machine learning" wording from the article, and emphasized the rule-based classification approach that we actually applied.

Comments 3: Detailed review of the manuscript (1) Provide details on the Plowed Land Index (PLI): The manuscript mentions the PLI and cites references [22, 23], but it does not briefly explain the principle behind its calculation. Please add a brief description in the methods section explaining how the PLI identifies tillage activities based on changes in soil brightness or the SWIR bands. (2) Universality of classification rules: Section 2.4 lists detailed classification rules and NDVI thresholds. Are these thresholds universal, or were they adjusted for the different agroclimatic zones shown in Figure 3? For example, given that crop phenology differs between the plain and foothill areas (as shown in Figure 8), did the classification rules take this into account? Please clarify this in the methods section.

Responses 3: Yes, we addressed these issues by adding an explanation of the calculation principle to the methods section (Section 2.4): The Plowed Land Index (PLI) was derived from the brightness index of the Tasseled Cap transform. The index is an original linear combination of the values of six spectral channels of Sentinel-2 satellite data: B2, B3, B4, B8, B11 and B12. When creating this index, each spectral channel was weighted using a specific weighting factor. The signs in the formula were selected to optimize the detection of plowed land in the foothill and lowland zones of the Turkestan region. As a result, a new linear combination was obtained for detecting plowed land in the study region:

PLI = −0.0037×B2 − 0.1793×B3 + 0.5403×B4 − 0.5585×B8 − 0.1082×B11 + 0.3013×B12

(We added this text to Section 2.4)
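As a minimal sketch of how the linear combination quoted above could be evaluated for a single pixel: the coefficients are copied from the response, while the function name, the dictionary layout, and the sample reflectance values are assumptions made purely for illustration and do not come from the manuscript.

```python
# Coefficients copied from the PLI formula quoted in the author response.
PLI_WEIGHTS = {
    "B2": -0.0037,
    "B3": -0.1793,
    "B4": 0.5403,
    "B8": -0.5585,
    "B11": -0.1082,
    "B12": 0.3013,
}


def plowed_land_index(reflectance):
    """Return the PLI for one pixel, given a dict of band name -> value."""
    return sum(weight * reflectance[band] for band, weight in PLI_WEIGHTS.items())


# Hypothetical pixel values (illustrative only, not taken from the study).
pixel = {"B2": 0.08, "B3": 0.10, "B4": 0.14, "B8": 0.22, "B11": 0.30, "B12": 0.26}
print(round(plowed_land_index(pixel), 4))
```

The response does not state which PLI threshold separates plowed from unplowed land, so none is applied here.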

Differences in crop phenology between agroclimatic zones, such as plains and foothills, were taken into account. The classification rules are not universal; they were developed with regard to the phenological periods specific to each crop and each zone. Classification was carried out using image composites collected during specific periods of the growing season, as shown in our rules. For example, conditions such as ("NDVI_25apr_composite" > "NDVI_25mar_composite") capture the beginning of active vegetation of the crop after sowing, and conditions such as ("NDVI_30may_composite" < "NDVI_peak") reflect the maturation phase, which differs between zones. Thus, these rules have been carefully adapted to the timing of growth of lowland and foothill crops, which supports the accuracy of the classification. A clarification was added to the methods section (Section 2.4), emphasizing that the classification rules could reasonably be improved further by taking phenological differences into account.
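The two example conditions quoted above can be sketched per pixel as follows. This is only an illustration of the rule-based style: the `con` helper is a simplified analogue of a raster conditional, the class labels are hypothetical, and the authors' full rule set and thresholds are not reproduced. The `ndvi` helper uses the standard definition with the near-infrared (B8) and red (B4) bands.

```python
def ndvi(nir, red):
    """Standard NDVI from near-infrared (B8) and red (B4) reflectance."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def con(condition, if_true, if_false):
    """Per-pixel analogue of a raster conditional: pick one of two values."""
    return if_true if condition else if_false


def classify_pixel(ndvi_25mar, ndvi_25apr, ndvi_30may, ndvi_peak):
    """Toy rule combining the two quoted conditions; labels are hypothetical."""
    greening_up = ndvi_25apr > ndvi_25mar  # active growth after sowing
    maturing = ndvi_30may < ndvi_peak      # already past the seasonal peak
    return con(greening_up and maturing, "candidate_crop", "other")


print(classify_pixel(0.20, 0.40, 0.50, 0.70))
```

In a zone-specific setup, one such function (or one set of composite dates) would be defined per agroclimatic zone.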

Reviewer 2 Report

Comments and Suggestions for Authors

The author proposes and compares two rainfed crop classification methods based on Sentinel-2 data, balancing both accuracy and efficiency, thereby providing effective monitoring tools and empirical support for sustainable agricultural management in Kazakhstan. However, several issues remain in the paper, including but not limited to the following:

（1）The title fails to effectively extract the core keywords, indicating that either the title or the keywords do not accurately reflect the main focus of the research. It is recommended that the author refine the title and keywords to better capture the essence of the study.

（2）The manuscript lacks a systematic review of existing research on crop classification using remote sensing (RS) and machine learning (ML) techniques. The author is advised to incorporate a more comprehensive literature review, explicitly identify major gaps in current studies, and clearly articulate the specific problems addressed in this paper. Emphasizing the study’s innovation and contributions will enhance its academic value and persuasiveness.

（3）The paper does not provide information on the sample size distribution of various crop types during key phenological stages, nor does it specify the weather and illumination conditions during the acquisition of remote sensing imagery. The absence of this information may limit the model's ability to capture full growth cycle characteristics, thereby affecting classification accuracy and generalizability. Supplementing this content would improve the study’s rigor and the reliability of its results.

（4）The title of the paper is “Application of Remote Sensing with Machine Learning for Crop Soil Monitoring in South Kazakhstan.” The author is encouraged to provide a detailed description of the machine learning algorithms used. Additionally, a comparative analysis with other commonly used ML algorithms or models in recent crop classification studies should be included, highlighting the advantages and limitations of the chosen methods.

（5）The description of the data preprocessing steps is insufficient. The author should elaborate on the specific preprocessing techniques and parameter settings used (e.g., data cleaning, normalization). A clear and detailed presentation of the preprocessing pipeline is crucial to ensure data quality, enhance model training stability, improve classification accuracy, and support transparency and reproducibility.

（6）It is recommended that the author explicitly state the study's limitations in the conclusion and suggest directions for future research to promote further advancement in this field.

（7）The author is also advised to expand the discussion section by analyzing the model's adaptability under different climatic conditions and exploring its scalability to other regions, thereby clarifying the applicability and generalizability of the approach. Furthermore, evaluating the model’s performance across various soil types would help verify its robustness and reliability.

Comments on the Quality of English Language

The manuscript is written in acceptable English. While there are some minor language and stylistic issues, they do not interfere with understanding the content. A light editing by a native English speaker is suggested to improve fluency.

Author Response

Comments 1: The title does not fully reflect the main keywords, which indicates that either the title or keywords do not accurately reflect the main research topic. The author is recommended to clarify the title and keywords to better reflect the essence of the research.

Comments 2: The manuscript does not provide a systematic review of existing research on crop classification using remote sensing (RS) and machine learning (ML) methods. The author is encouraged to include a more comprehensive review of the literature, clearly identify the main gaps in current research, and clearly state the specific problems addressed in this article. Emphasizing the innovation and contributions of the research will increase its academic value and persuasiveness.

Responses 2: We have updated our article based on your recommendations and provided more clarifications. We placed more emphasis on the rule-based classification that we actually used, which is not a machine learning method. We changed the article title to reflect properly what we did, removed the "machine learning" wording from the article, and emphasized the rule-based classification approach.

Comments 3: The article does not provide information on the distribution of sample sizes of different types of agricultural crops during key phenological phases, nor does it indicate weather and light conditions during remote sensing image acquisition. The lack of this information may limit the model's ability to capture the characteristics of the full growth cycle, which will affect classification accuracy and generalizability. Adding this information would increase the rigor of the study and the reliability of its results.

Responses 3: The article was updated. Section "3.2 Verifying the accuracy of developed methods with route survey data" now presents the distribution of the sample, with the area of each crop for 2018 and 2022, based on the field data. The classification accuracy for each crop (Table 4) gives more detailed information on the representativeness of our sample.

Comments 4: The title of the article is "Application of Remote Sensing with Machine Learning for Crop Soil Monitoring in South Kazakhstan". The author is encouraged to provide a detailed description of the machine learning algorithms used. In addition, a comparative analysis with other widely used machine learning algorithms or models from recent crop classification studies should be included, highlighting the advantages and limitations of the selected methods.

Responses 4: Yes, we agree that the original title, abstract, and content of the article were inconsistent. We changed the title to "Automatic Classification of Agricultural Crops Using Sentinel-2 Data in the Rainfed Zone of Southern Kazakhstan" to bring it in line with what we did. We also made appropriate changes to the abstract to reflect the subject and content of our manuscript. You are right that our study involved neither soil monitoring nor the application of machine learning methods. We worked on the classification of agricultural crops. We apologize for the original misunderstanding.

Comments 5: The description of the data preprocessing steps is insufficient. The author should describe in detail the preprocessing methods and parameters used (for example, data cleaning, normalization). A clear and detailed description of the preprocessing pipeline is critical to ensure data quality, improve model training stability, improve classification accuracy, and support transparency and reproducibility.

Responses 5: We have updated our article with the related additions. We clarified how clouds and shadows were removed, and this stage is now illustrated in the mapping flow chart. The data normalization and scaling process was also added to the flow chart. These steps include value scaling, which reduces the raster data values to a range from 0 to 1. The description of data format conversion, from a 16-bit format to a 1-bit format, was also updated. The processing of Sentinel-2A data with atmospheric correction to surface reflectance (Level 2A) has also been added to the flow chart.
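The value-scaling step mentioned above might look like the following minimal sketch. The response does not say how the 0 to 1 range was obtained, so dividing by a scale factor of 10000 (a common convention for Sentinel-2 reflectance products) and clipping is an assumption, as are the function and parameter names.

```python
def scale_to_unit_range(digital_numbers, scale=10000.0):
    """Scale integer raster values to floats and clip them to [0, 1].

    The default scale of 10000 is an assumption, not a value from the paper.
    """
    return [min(max(dn / scale, 0.0), 1.0) for dn in digital_numbers]


# Hypothetical 16-bit digital numbers for a few pixels.
print(scale_to_unit_range([0, 2500, 10000, 12000]))
```

Values above the scale factor (for example very bright surfaces) are clipped to 1 in this sketch rather than rescaled.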

Comments 6: Authors are encouraged to clearly indicate the limitations of the study in the conclusion and suggest directions for future research to promote further progress in this area.

Responses 6: The limitations of the study and the directions for future work were expanded. When applying the developed methodologies, it should be considered that rule-based classification may require additional calibration of the NDVI and PLI thresholds when used in other agroclimatic conditions. In addition, the absence of digitized field contour maps for the entire region is the main limitation on the scalability of the most accurate technique. In this regard, one of the most promising directions for future research is the integration of machine learning algorithms such as Random Forest or Support Vector Machine, which would improve the accuracy and adaptability of the classification, especially in the automated approach.

Comments 7: The author is also advised to expand the discussion section by analyzing the adaptability of the model to different climatic conditions and exploring its scalability to other regions, which will clarify the applicability and generalizability of the approach. In addition, evaluating the model's performance across different soil types would help verify its robustness and reliability.

Responses 7: The discussion section was expanded, and we made appropriate changes to the manuscript. Our work involved neither soil monitoring nor the application of machine learning methods; we worked on the classification of agricultural crops. We placed more emphasis on the work we actually carried out. We apologize for the original misunderstanding.

Reviewer 3 Report

Comments and Suggestions for Authors

Major Comments:

1) The novelty of the study could be better emphasized. Are there aspects (e.g., localized calibration of PLI, new rule-based logic, scaling automation to other semi-arid regions) that distinguish it from existing studies?

2) Add a brief comparison with similar recent works in Kazakhstan or other semi-arid zones in Introduction.

3) Clarify the objectives at the end of the Introduction.

4) The title and abstract emphasize machine learning, but the methodology section does not describe any actual machine learning models (e.g., random forest, SVM, deep learning).

5) Revise the title to reflect the methodology more accurately or clarify how machine learning is specifically integrated into the process.

6) The accuracy assessment relies heavily on overall classification accuracy and percentage deviation from official statistics. While helpful, additional metrics such as confusion matrices, Kappa coefficients, or F1 scores would improve the rigor of the validation.

7) Include pixel-based or object-based accuracy assessments using standard evaluation metrics, especially for the automated method.

8) The manuscript does not address uncertainties such as cloud contamination, temporal mismatch in phenological stages, or differences in crop management practices among farms.

9) Add a paragraph discussing the limitations and potential sources of misclassification, especially for crops like safflower that show spectral overlap.

10) The PLI plays a critical role in differentiating crops, but its calculation and sensitivity are not well explained. The reference to [45] is insufficient for replication.

11) Provide the full formula, spectral bands used, thresholds, and a sensitivity analysis if possible.

Minor Comments

1) There are several instances of awkward phrasing and grammatical issues (e.g., "getting popular worldwide", "crop soil agricultural sustainable management").

2) A thorough language edit by a native English speaker or editing service would improve readability.

3) Figures 12–15 (maps) are central to the validation section but would benefit from legends, clear labels, and improved resolution.

4) Tables 2–5 should have clearer titles and captions. For example, clarify whether the values are % errors, areas (ha), or pixel counts.

5) Several references appear with inconsistent formatting (e.g., [22,23], [31-33]) and might need formatting according to the journal’s guidelines.

Author Response

Comments 1: The novelty of the study could be better emphasized. Are there aspects (such as localized calibration of the PLI, new rule-based logic, scaling automation to other semi-arid regions) that distinguish it from existing studies?

Responses 1: Comparative evaluation of methodologies: Our study focuses on comparing two classification approaches for rainfed crops: one based on labor-intensive field investigations, and the second based on an automated, scalable methodology covering a wide area of rainfed agriculture. Such a comparative analysis is relevant for large areas and large agricultural territories such as those of Kazakhstan. The methodology is specifically suited to regions with agroclimatic conditions like those of South Kazakhstan. We applied rule-based classification logic together with a locally calibrated Plowed Land Index (PLI), which has been optimized to detect plowed land specifically in the foothill and arid zones of the Turkestan region.

Comments 2: Add in the Introduction a brief comparison with similar recent work in Kazakhstan or other semi-arid zones.

Responses 2: The review of similar works was updated.

Comments 3: Specify your goals at the end of the introduction.

Responses 3: The goal of the study is to compare two classification approaches for rainfed crops: one based on labor-intensive field investigations, and the second based on an automated, scalable methodology covering a wide area of rainfed agriculture. Such a comparative analysis is relevant for large areas and large agricultural territories such as those of Kazakhstan. The advantages and disadvantages of the two approaches are presented in terms of the quality of their outputs.

Comments 4: The title and abstract emphasize machine learning, but the methodology section does not describe any actual machine learning models (for example, random forest, SVM, deep learning).

Comments 5: Revise the title to better reflect the methodology or explain exactly how machine learning is integrated into this process.

Responses 5: We have updated our article based on your recommendations and provided more clarifications. We placed more emphasis on the rule-based classification that we actually used, which is not a machine learning method. We removed the "machine learning" wording from the article and emphasized the rule-based classification approach. We also changed the title to "Automatic Classification of Agricultural Crops Using Sentinel-2 Data in the Rainfed Zone of Southern Kazakhstan" to bring it in line with what we did.

Comments 6: The accuracy assessment relies heavily on overall classification accuracy and percentage deviation from official statistics. While these are helpful, additional metrics such as confusion matrices, Kappa coefficients, or F1 scores would improve the rigor of the validation.

Responses 6: Our approach focuses on estimating areas at the level of aggregated geographical units (districts and rainfed areas), with less emphasis on the pixel-by-pixel accuracy to which error matrices and the Kappa coefficient apply. The main task of this study was to estimate the total area of crops over large territories in comparison with official statistics, with validation at the farm level.
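For reference, the pixel-level metrics raised in this exchange can be computed from a confusion matrix as in the sketch below. The matrix values are invented for illustration and are not results from the study; rows are assumed to hold reference classes and columns predicted classes.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond chance (undefined if chance agreement is 1)."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (overall_accuracy(matrix) - expected) / (1 - expected)


def f1_score(matrix, k):
    """F1 for class k, with rows as reference and columns as prediction."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp  # predicted k, reference differs
    fn = sum(matrix[k]) - tp                 # reference k, predicted differs
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Invented two-class example (e.g. crop vs. non-crop sample points).
example = [[40, 10], [5, 45]]
print(overall_accuracy(example), cohen_kappa(example), f1_score(example, 0))
```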

Comments 7: Include pixel-based or object-based accuracy assessments using standard evaluation metrics, especially for the automated method.

Responses 7: We applied an object-based (feature-based) accuracy assessment, estimating the total area of crops over large territories in comparison with official statistics, with validation at the farm level.
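The area-based comparison described here can be sketched as follows: convert a count of classified pixels to hectares, then express the difference from an official figure as a percentage. The function names and the sample numbers are hypothetical; the 10 m pixel size corresponds to the Sentinel-2 visible and near-infrared bands (B2, B3, B4, B8).

```python
def pixel_count_to_hectares(pixel_count, pixel_size_m=10.0):
    """Area in hectares covered by a number of square pixels."""
    return pixel_count * pixel_size_m ** 2 / 10000.0


def percent_deviation(classified_ha, official_ha):
    """Signed deviation of the classified area from the official area, in %."""
    return (classified_ha - official_ha) / official_ha * 100.0


# Hypothetical district: 1,050,000 classified pixels vs. 10,000 ha reported.
classified = pixel_count_to_hectares(1_050_000)
print(classified, percent_deviation(classified, 10_000.0))
```

A positive value means the classification reports more area than the official statistics, a negative value less.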

Comments 8: The manuscript does not address uncertainties such as cloud contamination, temporal mismatch in phenological stages, or differences in crop management practices among farms.

Responses 8: The uncertainty analysis was updated. For cloud and shadow contamination, SCL (Scene Classification Layer) mask pre-processing was applied to remove clouds and their shadows. Regarding differences in cultivation methods, our analysis is based on field studies that covered four main types of rainfed crops: winter cereals, spring cereals, safflower and perennial crops. These crops were cultivated in the studied farms, and the timing of sowing, growth and harvesting was approximately the same, which made it possible to develop general rules for classification. Factors such as temporal mismatch of phenological stages and microclimatic differences are the main concerns in the uncertainty analysis.
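A minimal sketch of SCL-based masking is given below. The class codes follow the Sentinel-2 Level-2A scene classification (3 = cloud shadow, 8 and 9 = cloud medium and high probability, 10 = thin cirrus); which of these classes the authors actually masked is not stated in the response, so the chosen set is an assumption, as are the function name and sample values.

```python
# Assumed set of SCL classes to discard: cloud shadow (3), cloud medium
# probability (8), cloud high probability (9), thin cirrus (10).
MASKED_SCL_CLASSES = {3, 8, 9, 10}


def mask_clouds(values, scl, nodata=None):
    """Replace values whose SCL code marks cloud or shadow with nodata."""
    return [
        nodata if code in MASKED_SCL_CLASSES else value
        for value, code in zip(values, scl)
    ]


# Hypothetical NDVI values with their SCL codes (4 = vegetation, 5 = bare soil).
print(mask_clouds([0.1, 0.2, 0.3, 0.4], [4, 9, 3, 5]))
```

Masked pixels can then be skipped when building the multi-date composites.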

Comments 9: Add a paragraph that discusses limitations and potential sources of misclassification, especially for crops such as safflower that have spectral overlap.

Responses 9: We have added a separate paragraph to the "Discussion" section in which we analyze the limitations of our approach and the potential sources of error, including the problem of spectral overlap, in particular for safflower, and discuss the impact of the heterogeneity of categories on classification accuracy.

Comments 10: The PLI plays a critical role in differentiating crops, but its calculation and sensitivity are not well explained. The reference to [45] is insufficient for replication.

Responses 10: We elaborated further and added references [25, 45].

Comments 11: Provide the full formula, spectral bands used, threshold values, and sensitivity analysis, if possible.

Responses 11: Additional details were added to the methods section (Section 2.4). The Plowed Land Index (PLI) was derived from the brightness index of the Tasseled Cap transform. The index is an original linear combination of the values of six spectral channels of Sentinel-2 satellite data: B2, B3, B4, B8, B11 and B12. When creating this index, each spectral channel was weighted using a specific weighting factor. The signs in the formula were selected to optimize the detection of plowed land in the foothill and lowland zones of the Turkestan region. As a result, a new linear combination was obtained for detecting plowed land in the study region:

PLI = −0.0037×B2 − 0.1793×B3 + 0.5403×B4 − 0.5585×B8 − 0.1082×B11 + 0.3013×B12

Arystanov, A., Karabkina, N., Sagin, J., Nurguzhin, M., King, R., & Bekseitova, R. (2024). Use of Indices Applied to Remote Sensing for Establishing Winter–Spring Cropping Areas in the Republic of Kazakhstan. Sustainability, 16(17), 7548. https://doi.org/10.3390/su16177548

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript has comprehensively addressed all the key issues raised in the initial review and has been revised accordingly. The revised version shows significant improvement in clarity, accuracy, and scientific rigor.

The revised title, "Automatic Classification of Agricultural Crops Using Sentinel-2 Data in the Rainfed Zone of Southern Kazakhstan," along with the updated abstract and introduction, now accurately reflect the study's focus on crop type classification. This has made the manuscript's research focus clear and logically consistent. In the methodology section, inappropriate terminology has been adjusted and corrected, and a detailed description of the PLI index calculation has been added, which contributes to the transparency and reproducibility of this research.

I have no further revision suggestions.

Author Response

Comments 1:

In my opinion, this revised manuscript now meets the publication standards of this journal. I have no further revision suggestions.

Responses 1: Thank you very much for your time and the positive feedback!

Reviewer 2 Report

Comments and Suggestions for Authors

The author has thoroughly revised the paper in accordance with the reviewers' comments. Adding further comparative experiments on different crop classification methods would help validate the advantages and innovation of the proposed rule-based classification method. This would enhance the research's academic value and practical significance.

Comments on the Quality of English Language

Prior to publication, we recommend professional language editing to enhance clarity and fluency.

Author Response

Comments 1:

Comments on the Quality of English Language

Prior to publication, we recommend professional language editing to enhance clarity and fluency.

Responses 1:

Yes, thank you very much for the advice; this is a very important suggestion. We emphasized that our future research should be dedicated to comparing the current rule-based method with machine learning methods such as Random Forest and SVM. Furthermore, the developed methodologies may require additional calibration of the NDVI and PLI threshold values when applied in other agro-climatic conditions. Such a comparison would make it possible to assess the relative strengths and weaknesses of each approach and to further validate the effectiveness of the current methodology.

Definitely, prior to publication, professional language editing will be provided.
