Review Reports - Characterizing Crop Distribution and the Impact on Forest Conservation in Central Africa

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents an “Adaptive Month Matching (AMM) method” for improving transfer learning in cropland segmentation by aligning phenological stages between training and target areas using Sentinel-2 time series data. AMM effectively addresses phenological misalignment in transfer learning by leveraging NDVI profiles and Multi-Criteria Decision Analysis (MCDA), eliminating reliance on local cropland maps. The comparative evaluation of AMM against Iterative Month Matching (IMM) and Visual Interpretation Matching (VIM) adds robustness to the findings. Comprehensive use of Sentinel-2 and integration with Google Earth Engine (GEE) demonstrates scalability for large-scale applications. The U-Net with ResNet-34 encoder is well-justified, and hyperparameter tuning (e.g., early stopping, Adam optimizer) is thoroughly described.
Some points still need to be improved:

1) Line 251: “LSNTC and SSNTC had PA and UA accuracies of 40.0 percent ± 7.4 and 17.8 percent ± 3.9, 20.9 percent ± 1.8 and 38.7 percent ± 2.9, respectively.” This indicates that some accuracies are relatively low (for example, 17.8 percent). Could you analyze the reason? If the accuracy is this low, something might be wrong, and the method should be improved further.
2) Line 344: “CAR had the least forest to LSTC conversion area of 0.1 km2 and 0 km2 in the year 2000 and 2022, respectively, while none was recorded for Equatorial Guinea.” Please revise the 0 to “no variations occur…”.
3) Line 160: “The entire reference data can be accessed through the GEE platform.” Please provide the specific links here.
4) The language should be further improved, e.g., line 518: “Error! Reference source not found” and “KM2”; line 156: the symbol “)” should be deleted.

Author Response

Response to Reviewer 1 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the revisions/corrections highlighted/in track changes in the resubmitted files.
2. Questions for General Evaluation	Reviewer’s Evaluation	Response and Revisions
Does the introduction provide sufficient background and include all relevant references?	Yes/Can be improved/Must be improved/Not applicable
Are all the cited references relevant to the research?	Yes/Can be improved/Must be improved/Not applicable
Is the research design appropriate?	Yes/Can be improved/Must be improved/Not applicable
Are the methods adequately described?	Yes/Can be improved/Must be improved/Not applicable
Are the results clearly presented?	Yes/Can be improved/Must be improved/Not applicable
Are the conclusions supported by the results?	Yes/Can be improved/Must be improved/Not applicable
3. Point-by-point response to Comments and Suggestions for Authors
Comments 1: [lines 251:“LSNTC and SSNTC had PA and UA accuracies of 40.0 percent ± 7.4 and 17.8 percent ± 3.9, 20.9 percent ± 1.8 and 38.7 percent ± 2.9, respectively. ” This indicated that some accuracy is relatively low (for those equal to 17.8), could you analysis the reason. If the accuracy is too low, something might wrong. And this method should be improved further.]
Response 1: Thank you for pointing this out. We agree that the UA for LSNTC is low, owing largely to the high omission error of LSNTC reference sites (which were mostly classified as other land). We have now highlighted this problem, as requested by the reviewer, together with its possible cause. This can be found in the Discussion section of the manuscript, specifically between lines 350 and 351.
Comments 2: [Lines 344, “CAR had the least forest to LSTC conversion area of 0.1 km2 and 0 km2 in the year 2000 and 2022, respectively, while none was recorded for Equatorial Guinea” revise the 0 to the “no variations occur..]
Response 2: Many thanks for this observation. We have revised the sentence accordingly, and it now reads: “[In contrast, CAR had the least forest to LSTC conversion area of ~0.1 km2 and ‘no variation occur’ in the year 2000 and 2022, respectively, while none was recorded for Equatorial Guinea.]”
Comments 3: [Lines 160: “The entire reference data can be accessed through the GEE platform. ”Please provide the specific links here.]
Response 3: Thank you so much for raising this observation; we apologize for the oversight. We have now addressed this by adding a hyperlink to the ‘GEE Platform’ referenced in the sentence, in line 160 of the revised manuscript.
Comments 4: [The language should be further improved. eg. Lines 518: “Error! Reference source not found” “KM2” ; Lines 156, the symbol of “)”should be deleted.]
Response 4: We thank the Reviewer for pointing this out. We have double-checked the errors and addressed all of them. The reference source error in line 518 has been corrected and now points to a valid research article. In addition, we have ensured that km² is used rather than KM2. Lastly, we have deleted the bracket in line 158.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors propose a new perspective on crop classification by scale to map the agricultural landscape and assess forest loss. The manuscript seems original, and the authors conducted a detailed analysis in this regard. However, before it is accepted, I suggest the following modifications:

1) The area unit in the abstract is Km2, but in the table it is Km². I suggest standardizing the expression.

2) Lines 93-95: this sentence appears in an inappropriate place, and it is recommended that it be moved.

3) Was the reference data obtained at the same time as the other data? If not, did this have an impact on the results of the study?

4) Line 153: “the high resolution planet images” should include a description of the source, time of acquisition, etc.

5) Subsection 3.1 has many mischaracterizations. For example, in line 249, “OL had a PA and UA of 99 percent±0.1 and 90 percent±0.3” should be modified to “OL had a PA and UA of 90 percent±0.3 and 99 percent±0.1”, etc.

6) The graphs in the paper are not clear; it is recommended that their resolution be increased.

7) Lines 91-94 mention the challenges faced by deep learning methods. Do the deep learning models used in the paper ameliorate these problems? If so, please describe how.

Author Response

Response to Reviewer 2 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions/corrections highlighted/in track changes in the re-submitted files.
2. Questions for General Evaluation	Reviewer’s Evaluation	Response and Revisions
Does the introduction provide sufficient background and include all relevant references?	Yes/Can be improved/Must be improved/Not applicable
Are all the cited references relevant to the research?	Yes/Can be improved/Must be improved/Not applicable
Is the research design appropriate?	Yes/Can be improved/Must be improved/Not applicable
Are the methods adequately described?	Yes/Can be improved/Must be improved/Not applicable
Are the results clearly presented?	Yes/Can be improved/Must be improved/Not applicable
Are the conclusions supported by the results?	Yes/Can be improved/Must be improved/Not applicable
3. Point-by-point response to Comments and Suggestions for Authors
Comments 1: [The area unit in the abstract is Km2, but in the table it is Km². I suggest to standardize the expression.]
Response 1: Thank you for pointing this out. We agree with this comment, and we’ve taken measures to ensure consistency in the use of km² across the entire article, as evident in the track changes document.
Comments 2: [Line 93-95, this sentence appears in an inappropriate place, and it is recommended that it be changed.]
Response 2: Many thanks for making this observation. We have moved the entire sentence, which spanned lines 93-95, and appended it to line 91 to improve the flow. The passage now reads: “[Deep Learning (DL) techniques have further advanced crop mapping, using trainable semantic segmentation models that analyze large datasets. For example, [39] applied deep learning to identify land use following deforestation in Central Africa. Similarly, [40,41] have also explored DL and satellite image integration to characterize small-scale and large-scale oil palm. Despite these advances, challenges remain, particularly in the generalization of models across regions due to geographical biases and insufficient training data.]”
Comments 3: [Whether the reference data was obtained at the same time as the other data, and if not whether it had an impact on the results of the study.]
Response 3: Many thanks for this important question. It is indeed the case that all datasets were acquired at the same time. Part of the reference data came from the field, while the rest came from secondary sources. To ensure that the reference data were uniform and consistent across the board, we compared each land cover class with a high-resolution image basemap; where the reference data did not align with the feature in the high-resolution image, they were excluded. We have clarified this in lines 152-155: “[We extracted reference data from the tree cover class, which were validated using high resolution planet images for the tree cover and other classes to ensure that they were still the corresponding classes during the model development phase. The planet image used in this study was a global monthly mosaic for February 2023 acquired at 5 meter spatial resolution.]”
Comments 4: [Line 153, “the high-resolution planet images”, should have description of source, time of acquisition, etc.]
Response 4: Many thanks for raising this observation. We have now included the details of the Planet image, which was a global monthly mosaic product for February 2023 at 5-meter spatial resolution: “[The planet image used in this study was a global monthly mosaic for February 2023 acquired at 5 meter spatial resolution.]”
Comments 5: [Subsection 3.1 has many mischaracterizations, for example, Line 249,”OL had a PA and UA of 99 percent±0.1 and 90 percent±0.3 ” should be modified to ”OL had a PA and UA of 90 percent±0.3 and 99 percent±0.1”, etc.]
Response 5: We thank the reviewer most sincerely for identifying this issue. We have gone through the entire Subsection 3.1 (lines 252-259) and rearranged all PA and UA values so that they follow the order in which the results are presented in Table 1: “[Specifically, OL had a PA and UA of 90 percent ± 0.3 and 99 percent ± 0.1, while LSOP and SSOP had PA and UA of 88.3 percent ± 8.6 and 91.2 percent ± 2.5, 82.7 percent ± 8.2 and 52.3 percent ± 8.6, respectively. Similarly, LSNTC and SSNTC had PA and UA accuracies of 40.0 percent ± 7.4 and 17.8 percent ± 3.9, 38.7 percent ± 2.9 and 20.9 percent ± 1.8, respectively. In addition, LSTC and SSTC had PA and UA of 88.2 percent ± 5.1 and 60.5 percent ± 6.4, 63.5 percent ± 5.9 and 5.2 percent ± 0.8, respectively. Figure 3 shows the crop map layer for Central Africa by scale of production. Table 1 and Table 2 show the error matrix and the spatial extent for each of the classes.]”
Comments 6: [The graphs in the paper are not clear, it is recommended that the resolution of the graphs be increased.]
Response 6: We thank the Reviewer most sincerely for making this observation. We have re-exported most of our image files at a higher resolution, and they look much better than in the previous version.
Comments 7: [In Line 91-94, the challenges faced by deep learning methods are mentioned, whether the deep learning models used in the paper ameliorate these problems, and if so, please describe them.]
Response 7: Thank you so much for this observation. We are aware of the implications of this statement, and we feel obliged to describe how the methods used in this study addressed the challenges mentioned in the paragraph. It is, however, premature to say categorically that the method overcame all of the challenges and produced a 100% accurate map, which is hardly possible, especially as no previous studies have examined the same classes or produced a comparable map. Nevertheless, based on previous assessments and preliminary experiments carried out in the course of this study, we are confident that the U-Net with a ResNet encoder discriminated the classes better than most machine learning algorithms did. To highlight this, we have added to the Discussion section (lines 380-387) an account of the particular importance of the results obtained, despite the several challenges of opening a new frontier in crop-scale distribution mapping using EOS and DL methods.
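For readers less familiar with the PA and UA figures discussed in Response 5 above, the following is a minimal sketch of how producer's accuracy (PA), user's accuracy (UA), and overall accuracy are conventionally derived from an error matrix. It assumes rows are mapped classes and columns are reference classes, uses invented toy counts rather than the manuscript's Table 1, and does not reproduce the area weighting or the ± confidence intervals reported by the authors. The function name is hypothetical.

```python
def producer_user_accuracy(matrix):
    """Return (PA, UA, overall accuracy) for a square error matrix.

    Assumed convention: matrix[i][j] counts samples mapped as class i
    whose reference label is class j. Every row and column must have
    at least one sample, otherwise a division by zero occurs.
    """
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]                      # mapped totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # reference totals
    diagonal = [matrix[k][k] for k in range(n)]                    # correctly mapped samples

    # UA (1 - commission error): correct / everything mapped as the class.
    ua = [diagonal[k] / row_totals[k] for k in range(n)]
    # PA (1 - omission error): correct / everything that truly is the class.
    pa = [diagonal[k] / col_totals[k] for k in range(n)]
    overall = sum(diagonal) / sum(row_totals)
    return pa, ua, overall


# Toy two-class example (illustrative counts only).
toy_matrix = [
    [90, 10],  # mapped as class A: 90 truly A, 10 truly B
    [5, 45],   # mapped as class B: 5 truly A, 45 truly B
]
pa, ua, overall = producer_user_accuracy(toy_matrix)
print("PA:", pa, "UA:", ua, "OA:", overall)
```

Under this convention, a low UA for a class means that many pixels mapped to it belong to other classes (commission), while a low PA means that many reference sites of that class were mapped as something else (omission).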

Reviewer 3 Report

Comments and Suggestions for Authors

1. Originality and Scientific Contribution:
The article addresses an important regional issue with novel spatial-scale analysis using DL.

2. Importance and Relevance:
Highly relevant for understanding land-use impacts on forest conservation in Central Africa.

3. Methodological Soundness:
The methodology is robust and aligned with the state of the art in agricultural remote sensing. The use of multiple sources (radar and optical), temporal data, and deep learning enabled the production of an unprecedented and policy-relevant land cover map, particularly valuable for environmental conservation. Nevertheless, the reduced accuracy in mapping smallholder plots and the confusion between certain classes highlight the need for additional ground-truth data and refinement of region-specific classification models.

The U-Net + ResNet-50 segmentation method is appropriate. Comment on ground truth limitations is appreciated.

4. Quality and Clarity of Results:
Clear maps and accuracy tables, though some classes show low UA/PA (notably SSNTC and LSNTC).

5. Discussion and Conclusions:
Well-structured, addressing implications, although some conclusions could benefit from clearer linkage to results.

6. Manuscript Structure and Writing Quality:
Generally well written. Some grammatical refinement is suggested in the introduction and discussion.

7. Bibliographic Review:
Adequate and current, covering Sentinel-based studies and regional deforestation literature.

8. Scope Alignment:
Fits the scope of Remote Sensing journal well, integrating methodological and applied dimensions.

9. Technical Components:
Figures and tables are informative. Consider improving figure resolution and caption clarity in Figures 3–7.

10. Suggestions for Improvement:
Expand the explanation of misclassification causes; clarify training data availability and spatial bias issues; improve clarity in the description of crop class boundaries and definitions.

Some confusion may arise from the lower accuracy of certain classes (e.g., SSNTC and SSTC).

Author Response

Response to Reviewer 3 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions/corrections highlighted/in track changes in the re-submitted files.
2. Questions for General Evaluation	Reviewer’s Evaluation	Response and Revisions
Does the introduction provide sufficient background and include all relevant references?	Yes/Can be improved/Must be improved/Not applicable
Are all the cited references relevant to the research?	Yes/Can be improved/Must be improved/Not applicable
Is the research design appropriate?	Yes/Can be improved/Must be improved/Not applicable
Are the methods adequately described?	Yes/Can be improved/Must be improved/Not applicable
Are the results clearly presented?	Yes/Can be improved/Must be improved/Not applicable
Are the conclusions supported by the results?	Yes/Can be improved/Must be improved/Not applicable
3. Point-by-point response to Comments and Suggestions for Authors
Comments 1: [Originality and Scientific Contribution: The article addresses an important regional issue with novel spatial-scale analysis using DL]
Response 1: [We thank the Reviewer for making this observation.]
Comments 2: [Importance and Relevance: Highly relevant for understanding land-use impacts on forest conservation in Central Africa.]
Response 2: [We thank the Reviewer for making this observation and are glad that our thoughts are in alignment on the importance and relevance of this study.]
Comments 3: [Methodological Soundness: The methodology is robust and aligned with the state of the art in agricultural remote sensing. The use of multiple sources (radar and optical), temporal data, and deep learning enabled the production of an unprecedented and policy-relevant land cover map, particularly valuable for environmental conservation. Nevertheless, the reduced accuracy in mapping smallholder plots and the confusion between certain classes highlight the need for additional ground-truth data and refinement of region-specific classification models. The U-Net + ResNet-50 segmentation method is appropriate. Comment on ground truth limitations is appreciated.]
Response 3: [We thank the Reviewer most sincerely for this comment.] It is indeed true that the map produced lacked accuracy in certain respects due to spectral confusion among some classes. We believe access to more field data can improve the outcome of this study and hope that future studies can build on this further.
Comments 4: [Quality and Clarity of Results: Clear maps and accuracy tables, though some classes show low UA/PA (notably SSNTC and LSNTC).]
Response 4: [We thank the Reviewer for making this observation.]
Comments 5: [Discussion and Conclusions: Well-structured, addressing implications, although some conclusions could benefit from clearer linkage to results.]
Response 5: We thank the reviewer for taking the time to review this manuscript, and we appreciate the comment in this regard. We have tried to improve the narrative of the discussion and its linkage to the results, specifically in lines 362-374.
Comments 6: [Manuscript Structure and Writing Quality: Generally well written. Some grammatical refinement is suggested in the introduction and discussion.]
Response 6: [We thank the Reviewer for making this observation.] We have made an effort to address the specific areas mentioned in the Introduction section, especially between lines 87-102.
Comments 7: [Bibliographic Review: Adequate and current, covering Sentinel-based studies and regional deforestation literature.]
Response 7: [We thank the Reviewer for making this observation.]
Comments 8: [Scope Alignment: Fits the scope of Remote Sensing journal well, integrating methodological and applied dimensions.]
Response 8: [We thank the Reviewer for making this observation.]
Comments 9: [Technical Components: Figures and tables are informative. Consider improving figure resolution and caption clarity in Figures 3–7.]
Response 9: [We thank the Reviewer for making this observation.] We have re-exported most of our image files at a higher resolution, and they look much better than in the previous version.
Comments 10: [Suggestions for Improvement: Expand the explanation of misclassification causes; clarify training data availability and spatial bias issues; improve clarity in the description of crop class boundaries and definitions. Some confusion may arise from the lower accuracy of certain classes (e.g., SSNTC and SSTC)]
Response 10: [We thank the Reviewer for making this observation.] We have taken measures to address these concerns by providing more details in the Discussion section, specifically between lines 352-365: “[While the overall accuracy assessment of our study was 86.9 percent, we observed significant misclassification of SSNTC reference sites as OL (i.e. grassland and shrubland), LSTC, and SSTC. In addition, significant errors of omission also led to LSNTC often being classified as OL, LSOP and SSNTC. This could be associated to the diverse farming practice within this region, as more often than not farmers engage in mixed-farming [53–56], rotational farming [57,58] and intercropping [59], where food crops (such as cassava, maize, yam, etc.) are cultivated alongside tree crops to fulfill both short-term needs to sustain livelihood and long-term needs of generating income [60–63]. Furthermore, we also observed misclassification of shrubland as SSNTC and LSNTC mostly in the savanna dominated (northern and southern) part of the region, especially in central parts of Cameroon, CAR and the southern parts of DRC. We believe this could be because of the dryland ecosystem in the northern and southern part of the CA region [64], which is characterized with low spectral diversity between grassland and cropland [65], causing a problem for the classifier.]”

Reviewer 4 Report

Comments and Suggestions for Authors

Please see attached

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Sentence structure must be improved.

Author Response

Response to Reviewer 4 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions/corrections highlighted/in track changes in the re-submitted files.
2. Questions for General Evaluation	Reviewer’s Evaluation	Response and Revisions
Does the introduction provide sufficient background and include all relevant references?	Yes/Can be improved/Must be improved/Not applicable
Are all the cited references relevant to the research?	Yes/Can be improved/Must be improved/Not applicable
Is the research design appropriate?	Yes/Can be improved/Must be improved/Not applicable
Are the methods adequately described?	Yes/Can be improved/Must be improved/Not applicable
Are the results clearly presented?	Yes/Can be improved/Must be improved/Not applicable
Are the conclusions supported by the results?	Yes/Can be improved/Must be improved/Not applicable
3. Point-by-point response to Comments and Suggestions for Authors
Comments 1: [Abstract- Line 11-15; break down it to two sentences, as it's hard to read]
Response 1: [We thank the Reviewer for making this observation.] We have now broken it down into two sentences: “[In general, we observed that small-scale agriculture is fifteen times the size of large-scale agriculture as area estimates of small-scale non-tree crops and small-scale tree crops ranged between 164,823 ± 4,224 Km2 and 293,249 ± 12,695 Km2, respectively. Large-scale non-tree crops and large-scale tree crops ranged between 20,153 ± 1,195 Km2 and 7,436 ± 280 Km2, respectively.]”
Comments 2: [Line 26- use 2050 once]
Response 2: [We thank the Reviewer for making this observation.] The second use of 2050, i.e. ‘by 2050’, has been deleted, and the sentence now reads: “[The World Summit on Food Security stated that in 2050, “The world’s population is expected to grow to almost 10 billion, boosting agricultural demand - in a scenario of modest economic growth - by some 50 percent compared to 2013” [2].]”
Comments 3: [Line 30-32- rewrite the sentence]
Response 3: [We thank the Reviewer most sincerely for this comment.] The sentence has now been rewritten: “[Both agricultural intensification and land expansion frequently harm natural ecosystems and biodiversity, and can also negatively affect people, for instance through pollution caused by agrochemicals [3–5].]”
Comments 4: [Line 41-44 - rewrite the sentence.]
Response 4: [We thank the Reviewer for making this observation.] The sentence has been rewritten and now reads: “[The expansion of cropland and pasture is occurring primarily at the expense of forests, savannahs, and grasslands. In tropical regions in particular, forests face continuous pressure from disturbance and conversion to agriculture and other land uses [12].]”
Comments 5: [Line 87,90, 94. 96, mention the author's name and then cite the paper -For example, [39] applied—instead write X et al applied]
Response 5: We thank the reviewer for taking the time to review this manuscript, and we appreciate the comment in this regard. We have now gone through the manuscript and addressed all the identified grey areas in the citations.
Comments 6: [Line 115- Study -s should be small alphabet.]
Response 6: [Many thanks for this. The word now reads “study” rather than “Study”.]
Comments 7: [Line 147- The Reference Land Cover data represents conditions from 2015-2020, specify conditions of what?]
Response 7: [We thank the Reviewer for making this observation.] We have now reworked the sentence to reflect the appropriate on-the-ground features. The sentence now reads: “[The Reference Land Cover data represents on the ground condition of land cover features from 2015-2020.]”
Comments 8: [The term “parcels” is not well-defined. Are these field polygons or point samples?]
Response 8: [We thank the Reviewer for making this observation, and we apologize for the ambiguity.] The term “parcels” here refers to the field boundary of a particular crop, captured as a polygon. We have now clarified the term by providing more information: “[We obtained crop type ground truth reference point data (totaling 1,540) from 140 parcels in Cameroon and DRC. Field parcels here represents field boundary for a particular crop captured as polygon in a GIS software. The parcels were established during a field work exercise conducted in Cameroon to establish crop types between 7th December 2023 to 19th January 2024. In addition, we obtained data to establish crop type parcels for Central DRC collected between 12th and 17th of February 2020 in the context of another study [42].]”
Comments 9: [Line 197- cite U-Net model.]
Response 9: [Many thanks for this. We have now included a citation for the U-Net model.]
Comments 10: [Figures 5 and 6 caption are same? Correct it]
Response 10: [We thank the Reviewer for making this observation.] This is indeed true. We have altered the caption of Figure 6 to refer to DRC, while the caption of Figure 5 refers to both Cameroon and DRC.
Comments 11: [Line 352- rewrite the sentence]
Response 11: [We thank the Reviewer for making this observation.] The entire paragraph has now been revised and reads: “[In addition, significant errors of omission often led to LSNTC being misclassified as OL, LSOP, and SSNTC. This may be related to the diverse farming practices in the region, where farmers frequently engage in mixed farming [53–56], rotational farming [57,58], and intercropping [59], cultivating food crops (such as cassava, maize, and yam) alongside tree crops to meet both short-term livelihood needs and long-term income goals [60–63].]”
Comments 12: [Line 400-402- rewrite the sentence]
Response 12: [We thank the Reviewer for making this observation and are glad to make these corrections.] The revised sentence now reads: “[In contrast, large-scale plantations and agricultural operations, typically used to cultivate tree crops such as oil palm, cocoa, and rubber, tend to remain stable over long periods. Once established, they involve relatively minimal periodic interference with the forest ecosystem.]”
Comments 13: [Line 404- it should be ‘cannot be overlooked’]
Response 13: [Thanks so much for this. The correction has now been implemented in the revised version.]
Comments 14: [Line 411-414- break it into two sentences.]
Response 14: [We thank the Reviewer for making this observation.] The sentence has now been reworked into two sentences and reads: “[It is therefore suggested that both new and existing policy frameworks must address and balance the demand for land to support subsistence agriculture. At the same time, they should ensure the conservation of forest ecosystems and promote biodiversity within agricultural landscapes.]”
Comments 15: [Line 446-450- rewrite the sentences]
Response 15: We thank the reviewer for taking the time to review this manuscript, and we appreciate the comment in this regard. We have now reworked the sentences, which read: “[Similarly, there is a pressing need for a comprehensive transformation of food systems in the region. This transformation should involve governments, farmers, industries, financial institutions, scientists, and civil society working together to identify situations where the negative impacts of agricultural expansion outweigh its benefits, considering environmental, social, and economic factors. Such evaluations can guide the adoption of greener and more sustainable farming practices [77]. In addition, strong governance and increased conservation incentives can support land-sparing strategies [78] and enable targeted development planning that avoids placing key infrastructure, such as roads and buildings, in core forest areas.]”
Comments 16: [It’s great that you used multimodal data. However, please mention explicitly in the input variable subsection that you used the SAR product and optical data first- it’s for a non-background audience]
Response 16: We thank the reviewer for taking the time to review this manuscript, and we appreciate the comment in this regard. We have stated the use of SAR and optical image products at the beginning of the Input Variable section, which now reads: “[We used Synthetic Aperture Radar (SAR) and Multispectral Optical image of the European Space Agency’s. The SAR data is a Sentinel-1 C Band Level 1 Ground Range Detected (GRD) product through Google Earth Engine (GEE). The image covering the study area consisted of reflected beams in the Vertical-Vertical (VV) and Vertical-Horizontal (VH) polarizations.]”

Comments 17: [Pretrained U-Net/ResNet-50: There’s no discussion about the limitations of using a network pretrained on ImageNet (natural RGB images) for remote sensing multispectral and SAR data. This likely reduces the feature transferability and may be partly responsible for the low classification accuracy of SSNTC and SSTC. ---Provide justification or experiment with a model pretrained on geospatial data (e.g., SEN12MS or BigEarthNet).]
Response 17: We thank the reviewer for taking the time to review this manuscript, and we appreciate the comment in this regard. We have addressed this by providing more justification for using a network pretrained on ImageNet and transfer learning in this study (lines 228-229). We believe the limitations of this approach are very small, as it has been used in previous studies with very similar applications and achieved high accuracy: https://essd.copernicus.org/articles/13/1211/2021/.
Comments 18: [Input Feature: Only Sentinel-2 Band 4 was used—this is surprising since other bands like NIR (Band 8) and red-edge bands (5, 6, 7) are known to be highly sensitive to vegetation traits. ---Justify the band selection rigorously or consider including an ablation study to show the impact of additional bands.]
Response 18: We thank the reviewer for this observation. The choice of bands was based on the outcome of previous studies with similar applications and on initial experimental analysis carried out during this study to identify the Sentinel-1 and Sentinel-2 products that best discriminated among the target classes. The justification has now been provided between lines 134-136.
Comments 19: [Forest Masking Strategy: The approach uses a static 2023 forest mask but performs back-in-time forest conversion analysis to 2000. This likely introduces error, as areas converted prior to 2023 would have already been masked out. ---You should use dynamic forest masks for each time interval.]
Response 19: We thank the reviewer for this observation. To clarify, the 2023 forest mask was used to mask out forested areas in the composite used for the semantic segmentation that generated the crop-scale layer. The reason for this was to reduce spectral interference and to exclude forest from cropland areas, so that strictly a cropland layer is established. The resulting cropland area was then used to assess forest change dynamics, i.e. to establish, for example, which areas that were forest in 2000 had changed to cropland by 2005 and which remained forest as of 2005. In general, this helped to establish at what point within each 5-year interval forest gave way to cropland, as well as to quantify the area of forest loss (Figure 8). Essentially, we are assessing the forest-to-cropland conversion rate over the 20-year period; as such, we cannot employ dynamic forest masks.

Reviewer 5 Report

Comments and Suggestions for Authors

This study employs satellite remote sensing data and deep learning methods to classify agricultural land (distinguishing between large- and small-scale tree and non-tree crops) in Central Africa, and assesses the impact of agricultural expansion on forest loss from 2000 to 2022. The findings reveal that small-scale agriculture occupies a much larger area than large-scale agriculture and highlight its encroachment on tropical rainforests. The topic is of high significance, providing valuable data and perspectives to understand deforestation drivers in Central Africa and to inform conservation policies. The methodology is innovative, particularly in its attempt to distinguish agricultural production scales.

However, there are some concerns regarding methodological clarity, the robustness of classification accuracy in supporting conclusions, and the depth of results discussion that require revision and improvement.

Provide a clearer explanation of the definition of “scale” and how it is applied in the classification. The classification into "small-scale" and "large-scale" agriculture is critical to the study’s main argument. However, the definitions provided in Section 2.4 (e.g., LSTC - Large-scale Tree Crops, LSNTC - Large-scale Non-Tree Crops, SSTC - Small-scale Tree Crops, SSNTC - Small-scale Non-Tree Crops) rely on a combination of NDVI thresholds, visual characteristics (e.g., field homogeneity, road network), and assumed farming practices (e.g., level of mechanization). The authors are advised to more explicitly discuss to what extent the study's conclusions would be undermined if these definitions lack robustness (e.g., absence of clear, operational area thresholds or consistent morphological indicators). Emphasizing this point early in the paper will clarify the critical importance of this definition to the study’s findings.
Conduct a more in-depth analysis of how the low classification accuracies may affect the core conclusions of the study, especially regarding the estimated extent and impact of small-scale agriculture, and interpret the results with greater caution in the discussion. The paper acknowledges that key classes have relatively low User Accuracy (UA) and Producer Accuracy (PA), particularly LSNTC (UA: 17.8%), SSNTC (PA: 20.9%, UA: 38.7%), and especially SSTC (PA: 5.2%). Given that SSNTC and SSTC are considered the main forms of agriculture and the primary drivers of forest encroachment, these low accuracies severely weaken the reliability of area estimates and the derived impact assessments (e.g., the conclusion that small-scale agriculture occupies 15 times more area than large-scale agriculture). Although the authors mention “error-adjusted area estimates,” the extremely low PA of SSTC indicates a substantial omission of true SSTC areas. The authors must more thoroughly discuss how these uncertainties affect the reliability of their key conclusions and policy recommendations. The extremely low PA of 5.2% for SSTC is particularly problematic.
Explain in detail how the temporal inconsistencies between the different data sources were handled. The classification is mainly based on Sentinel imagery from 2021–2023/2024, while reference data sources vary in date, such as CHLCC data (2015–2020) and some field data from the Democratic Republic of the Congo (2020). Although the authors state that Planet imagery was used to update some reference points for tree crops, the potential impacts of temporal mismatches on the quality of reference data and classification accuracy need to be explicitly discussed. How did the authors address land cover changes between the reference data collection periods and the satellite imagery acquisition in such a dynamic agricultural landscape?
Provide more comprehensive details on the methodology used to generate the training masks. Section 2.7 explains that 350 labeled training masks (10 km × 10 km) were generated from 9313 reference points (50% used for training). However, the process of converting point or polygon data into area-based training masks is not clearly described. The authors should clarify how these masks were created (e.g., manual digitization from points, buffering), and how representativeness across the seven defined classes within the masks was ensured. This step is essential for understanding the model’s performance and the study’s reproducibility.
Discuss the treatment and potential influence of the “Other Land” class and clarify how the forest mask was applied in the long-term temporal change analysis. After masking out forests, the "Other Land" (OL) class covers most of the remaining area but includes highly heterogeneous land covers (e.g., water, built-up areas, wetlands, bare soil, grasslands, shrublands). The authors mention misclassification between SSNTC and some OL components (e.g., grass/shrubland). It is recommended to discuss whether the spectral heterogeneity within OL complicates the distinction of small-scale agricultural fields, which may resemble natural vegetation or fallow land. In addition, the forest mask applied in Section 2.6 uses JRC 2023 forest cover data. Since the forest-to-agriculture change analysis spans 2000–2022, did the authors also use the 2023 forest mask as the baseline for early years (e.g., 2000)? Or did they apply corresponding historical forest data consistently for each analysis period?

Author Response

Response to Reviewer 5 Comments
1. Summary
Thank you very much for taking the time to review this manuscript. Please find the detailed responses below and the corresponding revisions/corrections highlighted/in track changes in the re-submitted files.
2. Questions for General Evaluation	Reviewer’s Evaluation	Response and Revisions
Does the introduction provide sufficient background and include all relevant references?	Yes/Can be improved/Must be improved/Not applicable	[Please give your response if necessary. Or you can also give your corresponding response in the point-by-point response letter. The same as below]
Are all the cited references relevant to the research?	Yes/Can be improved/Must be improved/Not applicable
Is the research design appropriate?	Yes/Can be improved/Must be improved/Not applicable
Are the methods adequately described?	Yes/Can be improved/Must be improved/Not applicable
Are the results clearly presented?	Yes/Can be improved/Must be improved/Not applicable
Are the conclusions supported by the results?	Yes/Can be improved/Must be improved/Not applicable
3. Point-by-point response to Comments and Suggestions for Authors
Comments 1: [Provide a clearer explanation of the definition of “scale” and how it is applied in the classification. The classification into "small-scale" and "large-scale" agriculture is critical to the study’s main argument. However, the definitions provided in Section 2.4 (e.g., LSTC - Large-scale Tree Crops, LSNTC - Large-scale Non-Tree Crops, SSTC - Small-scale Tree Crops, SSNTC - Small-scale Non-Tree Crops) rely on a combination of NDVI thresholds, visual characteristics (e.g., field homogeneity, road network), and assumed farming practices (e.g., level of mechanization). The authors are advised to more explicitly discuss to what extent the study's conclusions would be undermined if these definitions lack robustness (e.g., absence of clear, operational area thresholds or consistent morphological indicators). Emphasizing this point early in the paper will clarify the critical importance of this definition to the study’s findings.]
Response 1: Thank you for pointing this out. We agree with this comment, and we’ve addressed it by providing a clearer definition of the term ‘scale’ in lines 174-178.
Comments 2: [Conduct a more in-depth analysis of how the low classification accuracies may affect the core conclusions of the study, especially regarding the estimated extent and impact of small-scale agriculture, and interpret the results with greater caution in the discussion. The paper acknowledges that key classes have relatively low User Accuracy (UA) and Producer Accuracy (PA), particularly LSNTC (UA: 17.8%), SSNTC (PA: 20.9%, UA: 38.7%), and especially SSTC (PA: 5.2%). Given that SSNTC and SSTC are considered the main forms of agriculture and the primary drivers of forest encroachment, these low accuracies severely weaken the reliability of area estimates and the derived impact assessments (e.g., the conclusion that small-scale agriculture occupies 15 times more area than large-scale agriculture). Although the authors mention “error-adjusted area estimates,” the extremely low PA of SSTC indicates a substantial omission of true SSTC areas. The authors must more thoroughly discuss how these uncertainties affect the reliability of their key conclusions and policy recommendations. The extremely low PA of 5.2% for SSTC is particularly problematic.]
Response 2: Many thanks for making this observation. We’ve tried to address this by discussing in detail the implications of such low classification accuracies in the discussion section of the manuscript, especially in lines 386-392. In addition, the conclusion reached in the study regarding the impact of small-scale agriculture on deforestation wasn’t based on our results alone; it is also the position of several other articles we referenced in the study. It is also worth noting that the spatial extent of the crop layer classes aligns closely with the extents of products such as the ESA and Potapov cropland layers. We believe the low accuracies result from the lack of extensive and reliable ground truth, which we highlighted in the limitations part of the manuscript.
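For readers less familiar with the accuracy terms used in Comments 2 and Response 2, the following is a minimal sketch of how producer's accuracy (PA) and user's accuracy (UA) are derived from a confusion matrix. The counts are hypothetical and are not taken from the manuscript; the function name and the three-class layout are assumptions made only for illustration. It shows why a very low PA signals omission: most reference samples of the class end up mapped as something else.

```python
def producer_user_accuracy(matrix, labels):
    """Compute PA and UA per class.

    matrix[i][j] = number of samples mapped as class i whose reference
    (ground-truth) class is j. Rows are map classes, columns are
    reference classes.
    """
    n = len(labels)
    result = {}
    for i, label in enumerate(labels):
        correct = matrix[i][i]
        mapped_total = sum(matrix[i])                        # row total
        reference_total = sum(matrix[r][i] for r in range(n))  # column total
        # UA: share of mapped samples that are truly this class (commission side).
        ua = correct / mapped_total if mapped_total else float("nan")
        # PA: share of reference samples that were found by the map (omission side).
        pa = correct / reference_total if reference_total else float("nan")
        result[label] = {"PA": pa, "UA": ua}
    return result


# Hypothetical counts, chosen only to illustrate a low-PA class.
labels = ["SSTC", "SSNTC", "OL"]
matrix = [
    [5, 2, 3],       # mapped as SSTC
    [15, 40, 45],    # mapped as SSNTC
    [80, 58, 752],   # mapped as OL
]

accuracy = producer_user_accuracy(matrix, labels)
for label in labels:
    print(label, accuracy[label])
```

In this made-up example only 5 of the 100 reference SSTC samples are mapped as SSTC, so the mapped SSTC area alone would strongly understate the true extent. This is the concern the reviewer raises about the reported SSTC PA.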
Comments 3: [Explain in detail how the temporal inconsistencies between the different data sources were handled. The classification is mainly based on Sentinel imagery from 2021–2023/2024, while reference data sources vary in date, such as CHLCC data (2015–2020) and some field data from the Democratic Republic of the Congo (2020). Although the authors state that Planet imagery was used to update some reference points for tree crops, the potential impacts of temporal mismatches on the quality of reference data and classification accuracy need to be explicitly discussed. How did the authors address land cover changes between the reference data collection periods and the satellite imagery acquisition in such a dynamic agricultural landscape?]
Response 3: Many thanks for this important question. It is indeed the case that the datasets were not all acquired at the same time. Part of the reference data came from the field, while the rest came from secondary sources. To ensure that the reference data were uniform and consistent across the board, we compared each land cover class with a high-resolution image basemap; where a reference point did not align with the feature in the high-resolution image, it was excluded. We’ve addressed this by shedding more light on it, specifically in lines 152-155: “We extracted reference data from the tree cover class, which were validated using high-resolution Planet images for the tree cover and other classes to ensure that they were still the corresponding classes during the model development phase. The Planet image used in this study was a global monthly mosaic for February 2023 acquired at 5-meter spatial resolution.”
Comments 4: [Provide more comprehensive details on the methodology used to generate the training masks. Section 2.7 explains that 350 labeled training masks (10 km × 10 km) were generated from 9313 reference points (50% used for training). However, the process of converting point or polygon data into area-based training masks is not clearly described. The authors should clarify how these masks were created (e.g., manual digitization from points, buffering), and how representativeness across the seven defined classes within the masks was ensured. This step is essential for understanding the model’s performance and the study’s reproducibility.]
Response 4: Many thanks for raising this observation. We have now included the details of the methodology, as requested, in lines 236-244: “The training dataset from the reference data was further developed into training masks by first creating a 10 km-by-10 km grid around the point(s) of interest, after which we used manual digitization to delineate the extent of the respective classes of tree and non-tree crops based on a high-resolution image (as shown in Figure 2). The digitized shapefile extents were then converted into training masks of 10 km-by-10 km grids covering both the defined classes and other land cover types. In total, we developed 350 labeled masks from the training data, as several reference points were within proximity of the 10 km grids generated.”
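The mask-generation steps described in Response 4 can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the function names, the class codes, the background value, the projected metre coordinates, and the coarse 1 km pixel size (chosen to keep the example small) are all assumptions. It builds a square tile centred on a reference point and burns manually digitized polygons into it using a standard ray-casting point-in-polygon test.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y))."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_mask(center, polygons, tile_size=10_000.0, pixel_size=1_000.0, background=0):
    """Rasterize (class_id, polygon) pairs into a square tile centred on `center`.

    Coordinates are assumed to be metres in a projected CRS. Pixels not
    covered by any polygon keep the `background` value (e.g. other land).
    """
    cx, cy = center
    n = int(tile_size // pixel_size)
    x_min = cx - tile_size / 2
    y_max = cy + tile_size / 2
    mask = [[background] * n for _ in range(n)]
    for row in range(n):
        py = y_max - (row + 0.5) * pixel_size      # pixel-centre northing
        for col in range(n):
            px = x_min + (col + 0.5) * pixel_size  # pixel-centre easting
            for class_id, polygon in polygons:
                if point_in_polygon(px, py, polygon):
                    mask[row][col] = class_id      # first matching polygon wins
                    break
    return mask


# Hypothetical example: one digitized field (class 1) in the south-west corner.
field = [(0.0, 0.0), (4000.0, 0.0), (4000.0, 4000.0), (0.0, 4000.0)]
mask = build_mask(center=(5000.0, 5000.0), polygons=[(1, field)])
for row in mask:
    print("".join(str(v) for v in row))
```

In practice a geospatial library would normally handle the rasterization at the imagery's native resolution; the pure-Python version here only makes the point-to-tile-to-mask logic explicit.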
Comments 5: [Discuss the treatment and potential influence of the “Other Land” class and clarify how the forest mask was applied in the long-term temporal change analysis. After masking out forests, the "Other Land" (OL) class covers most of the remaining area but includes highly heterogeneous land covers (e.g., water, built-up areas, wetlands, bare soil, grasslands, shrublands). The authors mention misclassification between SSNTC and some OL components (e.g., grass/shrubland). It is recommended to discuss whether the spectral heterogeneity within OL complicates the distinction of small-scale agricultural fields, which may resemble natural vegetation or fallow land. In addition, the forest mask applied in Section 2.6 uses JRC 2023 forest cover data. Since the forest-to-agriculture change analysis spans 2000–2022, did the authors also use the 2023 forest mask as the baseline for early years (e.g., 2000)? Or did they apply corresponding historical forest data consistently for each analysis period?]
Response 5: We thank the reviewer most sincerely for identifying this particular issue. To clarify, the 2023 forest mask was used to mask out forested areas in the composite used for the semantic segmentation that generated the crop-scale layer. The reason for this was to reduce spectral interference and exclude forest from cropland areas so that, strictly speaking, only a cropland layer is established. The resulting cropland area was then used to assess forest change dynamics (i.e., to establish, for example, areas that were forest in 2000 and had changed to cropland by 2005, or that remained forest as of 2005). In general, this helped to establish at what point in the 5-year intervals forest gave way to cropland, as well as to quantify the area of forest loss (Figure 8). Basically, we are assessing the forest-to-cropland conversion rate over the 20-year period; as such, we cannot employ dynamic forest masks.
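One possible reading of the interval-based conversion analysis described in Response 5 is sketched below. It is an assumption-laden illustration, not the authors' workflow: the per-pixel forest time series, the function names, and the pixel area are hypothetical. For each pixel mapped as cropland in the final layer, it finds the first interval in which the pixel went from forest to non-forest and accumulates the converted area per interval.

```python
def conversion_interval(forest_series, years, is_cropland_now):
    """Return the (start, end) interval in which a present-day cropland pixel
    left forest, or None if it is not cropland or was never forest."""
    if not is_cropland_now:
        return None
    for i in range(len(years) - 1):
        if forest_series[i] and not forest_series[i + 1]:
            return (years[i], years[i + 1])
    return None


def converted_area_by_interval(pixels, years, pixel_area_km2):
    """Sum forest-to-cropland area per interval.

    pixels: iterable of (forest_series, is_cropland_now), where
    forest_series[i] is True if the pixel was forest in years[i].
    """
    totals = {}
    for forest_series, is_cropland_now in pixels:
        interval = conversion_interval(forest_series, years, is_cropland_now)
        if interval is not None:
            totals[interval] = totals.get(interval, 0.0) + pixel_area_km2
    return totals


# Hypothetical pixels; a 10 m pixel covers 0.0001 km2.
years = [2000, 2005, 2010]
pixels = [
    ([True, False, False], True),    # forest lost 2000-2005, cropland today
    ([True, True, False], True),     # forest lost 2005-2010, cropland today
    ([True, True, True], False),     # still forest, not cropland
    ([True, False, False], False),   # forest lost, but not to cropland
    ([False, False, False], True),   # cropland that was never forest
]
print(converted_area_by_interval(pixels, years, pixel_area_km2=0.0001))
```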

Round 2

Reviewer 5 Report

Comments and Suggestions for Authors

The authors responded positively to all the questions and concerns raised in the previous round of review.

Author Response

We thank the reviewer for this positive feedback.