Open Access Communication

Peer-Review Record

Comparison of Habitat Selection Models Between Habitat Utilization Intensity and Presence–Absence Data: A Case Study of the Chinese Pangolin

Biology 2025, 14(8), 976; https://doi.org/10.3390/biology14080976

by Hongliang Dou, Ruiqi Gao, Fei Wu and Haiyang Gao^*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 27 June 2025 / Revised: 24 July 2025 / Accepted: 30 July 2025 / Published: 1 August 2025

(This article belongs to the Section Conservation Biology and Biodiversity)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Brief summary

This manuscript presents a well-structured comparison between habitat utilization intensity (continuous) and presence–absence (binary) data in identifying key habitat characteristics of the critically endangered Chinese pangolin. The study is methodologically sound, employing field-based burrow surveys and generalized additive models (GAMs) across a relevant set of environmental variables. A major strength lies in demonstrating that continuous data offer greater explanatory power, capture nonlinear ecological relationships, and better inform conservation planning. The paper makes a valuable contribution to conservation methodology by underscoring the importance of data type in habitat modeling, with clear implications for improving in situ conservation strategies for endangered species.

General comments

The manuscript is generally well written. However, please carefully check and fix the minor typographical and grammatical errors throughout.

Specific comments

L 23-24- Replace “focused the critically Chinese pangolin” with “focused on the critically endangered Chinese pangolin”
L 24- Use “600 × 600 m grids” instead of “600m square grids”.
L 46-48- Paraphrase for improved clarity.
L 48-51- The sentence is too long. Revise it.
L 52-56- The sentences are redundant. Consider merging and rephrasing them.
L 56-59- This is vague. What process? How does this sentence relate to the previous?

L 109- What does “sampling intensity” mean here? Is it area coverage? Consider defining it.
L 151- Specify the version of “R” used.
The results section is generally well-structured.
L 210-214- Move the sentence to the end of Discussion as a concluding synthesis.
L 258- It is the interpretation or detection of habitat characteristics that data type shapes. Consider rephrasing for accuracy.

Author Response

#24-July-2025

Dear anonymous reviewer,

Thank you for your help and valuable comments. We have revised the article according to your comments and those of the other reviewer. In the presence-absence model, we introduced an “offset” term to eliminate the effect of line transect length. We further revised the logic of the Discussion section and avoided overly subjective statements. We also revised the whole article to correct grammatical and terminological problems.

Once again, we would like to thank the experts for reviewing our paper and giving us these valuable comments, which helped us to greatly improve the rigor of the paper.

Sincerely,

Haiyang Gao

Point-by-point responses are as follows:

# Reviewer 1

L 23-24- Replace “focused the critically Chinese pangolin” with “focused on the critically endangered Chinese pangolin”.

Reply: Thank you for the comment. We have corrected the phrasing to “focused on the critically endangered Chinese pangolin” in the revised manuscript.

L 24- Use “600 × 600 m grids” instead of “600m square grids”.

Reply: We have corrected it to “600m × 600 m grids” in the revised manuscript.

L 46-48- Paraphrase for improved clarity.

Reply: Thank you for the comment. We have revised those sentences to read: “Presence–absence data and habitat utilization intensity data might reflect specific responses to environmental variables and are widely used to analyze species–habitat associations, assess habitat suitability, identify threat factors, and guide targeted conservation practices [3,4].”

L 48-51- The sentence is too long. Revise it.

Reply: We have revised the sentence to improve clarity and reduce its length in the revised manuscript: “However, the differences introduced by using different data types have often been overlooked. This can lead to inconsistencies in understanding the specific habitat requirements of endangered species, potentially hindering effective habitat conservation and management [5–7].”

L 52-56- The sentences are redundant. Consider merging and rephrasing them.

Reply: We have merged and rephrased these sentences to eliminate redundancy and improve clarity in the revised manuscript: “Essentially, presence–absence data indicates whether a species selects or avoids a particular habitat, while habitat utilization intensity quantifies the frequency and degree of habitat use, providing additional ecological details [8]”.

L 56-59- This is vague. What process? How does this sentence relate to the previous?

Reply: Thank you for your suggestion. We have rephrased this part for better clarity and logical consistency: “Therefore, comparing and integrating these two data types allows for a more comprehensive understanding of the ecological processes involved–from initial habitat selection to subsequent habitat utilization.”

L 109- What does “sampling intensity” means here. Is it area coverage? Consider defining it.

Reply: Thank you for your suggestion. “Sampling intensity” refers to the proportion of surveyed grids within the buffer zone. We have modified the sentence to read “...reaching 32.6% of the total grids within the 2km buffer zone...”.

L 151- Specify the version of “R” used.

Reply: Thank you for your reminder. We have specified the version of R used (version 4.3.2).

L 210-214- Move the sentence to the end of Discussion as a concluding synthesis.

Reply: Thank you for your valuable suggestion. We have relocated the sentence to the end of the Discussion section as a concluding synthesis.

L 258- It is the interpretation or detection of habitat characteristics that data type shapes. Consider rephrasing for accuracy.

Reply: Thank you for the valuable comment. We have revised the sentence to clarify that it is the interpretation of habitat characteristics that is influenced by data type: “Our study demonstrates that data type profoundly influences the interpretation of habitat characteristics for the critically endangered Chinese pangolin.”

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The study by Dou and collaborators addresses an important methodological question in habitat modeling: how different data types, such as habitat utilization intensity vs presence-absence, can influence the identification of key habitat characteristics. The authors employ Generalized Additive Models (GAMs) to evaluate the relationship between environmental variables and species occurrence or use, using burrows as a proxy for utilization. Their comparison of model performance and explanatory power provides valuable insights into how data structure affects ecological inference and, consequently, conservation decision-making.

However, there are some issues that should be considered before this article can be published in Biology. One of my primary concerns with the study lies in the data collection process, specifically the absence of any correction or adjustment for detection probability. The number of burrows per unit length of transect was used as a direct proxy for habitat utilization intensity. However, this approach assumes perfect detectability (that is, that all existing burrows along the transects were observed and recorded). This assumption is rarely valid in field conditions. Factors such as observer bias, burrow visibility, and vegetation cover can all influence detection rates, potentially leading to underestimation or spatial bias in the recorded utilization intensity or presence-absence data. Without addressing detectability (e.g., through occupancy models on repeated surveys), both the intensity and presence-absence models may reflect detection patterns rather than true ecological use. Alternatively, incorporating a distance-sampling framework, in which the distance from each detected burrow to the transect line is recorded, could help estimate the effective detection area. This would make it possible to model detection probability as a function of distance and adjust habitat use estimates accordingly. Then, by measuring environmental covariates within that effective sampling area (both in places with and without burrows) one could mitigate the potential bias introduced by imperfect detectability. I understand that it may now be too late to implement such resampling or adjustments in the current dataset. However, these sources of potential bias should be explicitly acknowledged and discussed as limitations in the manuscript.
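The distance-sampling adjustment suggested above can be illustrated with a minimal R sketch. It is not drawn from the manuscript: the half-normal detection function, the 30 m truncation distance, the 10 m scale parameter, and the simulated burrow distances are all assumptions chosen for illustration. The sketch estimates the detection scale from the perpendicular distances of detected burrows and converts it into an effective strip half-width.

```r
# Half-normal detection function (an assumed form): probability of detecting
# a burrow at perpendicular distance x (m) from the transect line.
g <- function(x, sigma) exp(-x^2 / (2 * sigma^2))

set.seed(42)
w          <- 30   # truncation distance in metres (hypothetical)
sigma_true <- 10   # true detection scale in metres (hypothetical)

# Simulate burrows spread uniformly in distance from the line, then keep
# only those that are "detected" with probability g(x).
x_all <- runif(5000, 0, w)
seen  <- x_all[runif(5000) < g(x_all, sigma_true)]

# Negative log-likelihood of the detected distances under a half-normal
# detection function truncated at w.
nll <- function(log_sigma) {
  s  <- exp(log_sigma)
  mu <- integrate(g, 0, w, sigma = s)$value  # effective strip half-width
  -sum(log(g(seen, s) / mu))
}

fit       <- optimize(nll, interval = log(c(1, 100)))
sigma_hat <- exp(fit$minimum)

# Effective strip half-width and average detection probability within w.
esw      <- integrate(g, 0, w, sigma = sigma_hat)$value
p_detect <- esw / w
```

Dividing the burrow count on a transect by `2 * esw` times the transect length, rather than by length alone, would give a density that accounts for burrows missed far from the line, provided the assumed detection function is reasonable.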

A second methodological concern relates to the variation in sampling effort across the grid cells. The authors report establishing a total of 82.48 km of line transects across 75 grids, with an average transect length of 1.10 km per grid, ranging from 0.61 km to 2.56 km. This uneven effort across grids raises potential issues for both the intensity and presence-absence models, as the probability of detecting burrows increases with sampling effort. Consequently, grids with longer transects may appear to have higher habitat utilization or higher presence rates simply due to increased detectability, not because of true differences in habitat preference or use. Without standardizing or statistically accounting for variation in sampling effort, the comparisons between grids may be biased, leading to inaccurate inferences about the environmental factors influencing habitat selection. In addition, there appears to be a discrepancy between the reported results and Figure 1. The text states that burrows were recorded in 16 grids; however, upon examining Figure 1, it seems that at least 20 grids are marked as containing burrows.

On the other hand, while the authors correctly highlight that the habitat utilization intensity model can better capture nonlinear relationships in the Discussion section, such as with profile curvature, it is also important to acknowledge the advantages of presence-absence models, which are not discussed in the manuscript. Presence-absence models may be less sensitive to issues of spatial or individual-level dependence, especially in cases where multiple burrows within a single transect could have been created by the same individual. This dependency structure can result in overrepresentation of certain individuals in the intensity data, or higher detectability, potentially skewing model results and leading to erroneous ecological inferences. Presence-absence data, though coarser, can help mitigate this issue by reducing the weight of clustered observations from a single animal. A more balanced discussion of the strengths and limitations of both modelling approaches would strengthen the manuscript.

Moreover, the study reports different sets of key predictors for each modelling approach, a discrepancy that deserves deeper analysis. While the habitat utilization intensity model identified profile curvature and slope as primary factors, the presence-absence model emphasized distance to water and aspect. This divergence suggests that each model type may be capturing different ecological signals or responding differently to underlying data structures. Presence-absence models tend to be more conservative and less sensitive to outliers or overrepresentation, which might explain why distance to water, a factor biologically relevant for many mammals, emerged only in this model type. The fact that distance to water was not significant in the utilization intensity model raises important questions. Was this due to a lack of variability in that predictor within high-use areas? The manuscript would benefit from a more thorough exploration of why different variables emerged as important in each model, and what this implies for habitat selection and conservation planning for the Chinese pangolin.

The statement that the binary model "explains only 62.2% deviance" seems to unnecessarily downplay its performance. In fact, this is only marginally lower than the 65.2% reported for the intensity model, and the binary model also shows a higher pseudo R². This suggests that its explanatory power is not significantly weaker, and may even outperform the intensity model in some respects, depending on the evaluation criteria used. Therefore, framing the results in a way that suggests a clear superiority of one model over the other may be misleading.

In light of this, I would suggest reconsidering the structure and focus of the article. Rather than emphasizing the limitations of presence-absence models upfront, the discussion could first present and compare the habitat selection patterns revealed by both models. Then, a more balanced and nuanced discussion could follow, highlighting the respective advantages and limitations of each approach. Presence-absence models are widely used in ecology for valid reasons, including their robustness and interpretability, and should not be framed primarily in terms of what they fail to capture. A deeper, comparative analysis of why the models differ and what this implies for ecological inference would significantly strengthen the manuscript.

I would also question the suggestion in the conclusion that habitat utilization intensity data should be "prioritized" for in situ conservation. While this type of data can indeed provide finer-scale ecological insights, its collection is often resource-intensive, and its interpretation may be more sensitive to issues such as detectability, sampling effort, and spatial autocorrelation. In contrast, presence-absence data, while simpler, are more practical for large-scale monitoring and may offer more robust inferences in certain contexts, particularly when individual-level clustering may bias intensity measures. Rather than prioritizing one data type over the other, I would recommend emphasizing the complementary value of both. Each type captures different dimensions of species-habitat relationships, and their combined use can lead to a more holistic understanding.

Author Response

#24-July-2025

Dear anonymous reviewer,

Once again, we would like to thank the experts for reviewing our paper and giving us these valuable comments, which helped us to greatly improve the rigor of the paper.

Sincerely,

Haiyang Gao

Point-by-point responses are as follows:

# Reviewer 2

Q1: One of my primary concerns with the study lies in the data collection process, specifically the absence of any correction or adjustment for detection probability. The number of burrows per unit length of transect was used as a direct proxy for habitat utilization intensity. However, this approach assumes perfect detectability (that is, that all existing burrows along the transects were observed and recorded). This assumption is rarely valid in field conditions. Factors such as observer bias, burrow visibility, and vegetation cover can all influence detection rates, potentially leading to underestimation or spatial bias in the recorded utilization intensity or presence-absence data. Without addressing detectability (e.g., through occupancy models on repeated surveys), both the intensity and presence-absence models may reflect detection patterns rather than true ecological use. Alternatively, incorporating a distance-sampling framework, in which the distance from each detected burrow to the transect line is recorded, could help estimate the effective detection area. This would make it possible to model detection probability as a function of distance and adjust habitat use estimates accordingly. Then, by measuring environmental covariates within that effective sampling area (both in places with and without burrows) one could mitigate the potential bias introduced by imperfect detectability. I understand that it may now be too late to implement such resampling or adjustments in the current dataset. However, these sources of potential bias should be explicitly acknowledged and discussed as limitations in the manuscript.

Reply: Thank you for your valuable comments. The primary concern is detection probability. We did not assume perfect detection, which, as you point out, cannot be achieved in a field survey. Instead, we took control measures in our study design and analysis to reduce potential detection bias. First, all field surveys were conducted by Ruiqi Gao (the second author of this paper) and a fixed, experienced guide (Jinzhen Yang or Jingxin Wang) to reduce detection bias. Second, we included key environmental factors (e.g., slope, aspect, profile curvature, distance from water source, land use type, etc.) that may affect burrow visibility and detection probability as fixed effects in the GAMs, which further reduces the interference of environmental heterogeneity with detection probability. In addition, the present study was based on static, persistent and recognizable Chinese pangolin burrows, which differ from the “presence-absence” characteristics of other animals.

We agree that detection probability fluctuates somewhat in field surveys. Moderate fluctuation of this kind is common and unavoidable in actual field surveys and acts as a kind of random effect. However, after effective control of the confounding factors, such fluctuations fall within the acceptable and reasonable range for field ecological research and will not substantially affect the core conclusions of this study. In response to your comments, we have explained the limitations of the study in the Discussion section: “A limitation of this study is that burrows were not fully counted potentially underestimating the intensity of habitat use by species, but the quantitative results of relative intensity use intensity remain reliable.”

Q2: A second methodological concern relates to the variation in sampling effort across the grid cells. The authors report establishing a total of 82.48 km of line transects across 75 grids, with an average transect length of 1.10 km per grid, ranging from 0.61 km to 2.56 km. This uneven effort across grids raises potential issues for both the intensity and presence-absence models, as the probability of detecting burrows increases with sampling effort. Consequently, grids with longer transects may appear to have higher habitat utilization or higher presence rates simply due to increased detectability, not because of true differences in habitat preference or use. Without standardizing or statistically accounting for variation in sampling effort, the comparisons between grids may be biased, leading to inaccurate inferences about the environmental factors influencing habitat selection. In addition, there appears to be a discrepancy between the reported results and Figure 1. The text states that burrows were recorded in 16 grids; however, upon examining Figure 1, it seems that at least 20 grids are marked as containing burrows.

Reply: Thank you for pointing out this problem. To account for the differing survey effort among grids, we introduced the line transect length as an “offset” term in the presence-absence model, which reduces the influence of transect length on detection probability and avoids bias in the results. The R code was modified as follows:

“model <- gam(PA ~ s(Dis_water, k=3) + s(Dis_cropland, k=3) + s(Plane_curvature, k=3) + s(Profile_curvature, k=3) + s(Aspect, k=3) + s(Slope, k=3) + s(Altitude, k=3) + s(Shurubland, k=3) + s(Water, k=3) + offset(log(`Line transect length`)), family = binomial(), data = mydata)”
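The role of a transect-length offset can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' analysis: it uses base R `glm` rather than `gam`, simulated data, a hypothetical covariate `slope`, and a complementary log-log link. If burrows occur along a transect as a Poisson process with rate λ per km, the probability of recording at least one burrow on a transect of length L is 1 − exp(−λL), so cloglog(p) = log λ + log L and log(L) enters exactly as an offset. With the default logit link the same offset is an approximation that is close when presence probabilities are small.

```r
set.seed(1)
n      <- 2000
len_km <- runif(n, 0.61, 2.56)     # transect length per grid (range from the report)
slope  <- runif(n, 0, 40)          # hypothetical covariate, degrees
rate   <- exp(-2 + 0.05 * slope)   # simulated burrows per km (assumed values)

# Presence = at least one burrow recorded on a transect of this length.
pa <- rbinom(n, 1, 1 - exp(-rate * len_km))

# Binary model with log(length) as an offset on the cloglog scale, so the
# coefficients describe the per-km burrow rate rather than survey effort.
fit <- glm(pa ~ slope + offset(log(len_km)),
           family = binomial(link = "cloglog"))

coef(fit)
```

In this simulation the fitted `slope` coefficient should lie close to the simulated value of 0.05, which shows that the offset separates the effect of habitat from the effect of transect length.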

In addition, regarding the inconsistency between the text and Figure 1, we thank you for your careful review. We have checked and corrected this: some grids with transect lengths below 600 m were discarded, but the burrows found in them were still plotted in Figure 1. We have now deleted these burrow sites from the figure.

Q3: On the other hand, while the authors correctly highlight that the habitat utilization intensity model can better capture nonlinear relationships in the Discussion section, such as with profile curvature, it is also important to acknowledge the advantages of presence-absence models, which are not discussed in the manuscript. Presence-absence models may be less sensitive to issues of spatial or individual-level dependence, especially in cases where multiple burrows within a single transect could have been created by the same individual. This dependency structure can result in overrepresentation of certain individuals in the intensity data, or higher detectability, potentially skewing model results and leading to erroneous ecological inferences. Presence-absence data, though coarser, can help mitigate this issue by reducing the weight of clustered observations from a single animal. A more balanced discussion of the strengths and limitations of both modelling approaches would strengthen the manuscript.

Reply: We thank the reviewer for these valuable comments. We agree that the presence-absence model has certain advantages in dealing with individual repetitiveness or spatial dependence. In the revised manuscript, we have compared the advantages and disadvantages of presence-absence data and utilization intensity data in the Discussion section, covering aspects including nonlinear fitting ability and model explanatory power. For example, we pointed out that the presence-absence model is less sensitive and is suitable for large-scale distribution prediction, whereas the use intensity model is more capable of capturing detailed changes along continuous ecological gradients and in habitat use.

Meanwhile, given the spatial scale and data characteristics of this study, we believe that individual repeatability has limited effects, because the line transect survey covered a large spatial scale (>600 m per grid) and the data reflect the local group rather than individuals. Therefore, we believe that the habitat utilization intensity model can more effectively reflect the habitat utilization intensity and ecological preference of Chinese pangolins under different environmental gradients. Thank you again for your careful review and valuable comments.

Q4: Moreover, the study reports different sets of key predictors for each modelling approach, a discrepancy that deserves deeper analysis. While the habitat utilization intensity model identified profile curvature and slope as primary factors, the presence-absence model emphasized distance to water and aspect. This divergence suggests that each model type may be capturing different ecological signals or responding differently to underlying data structures. Presence-absence models tend to be more conservative and less sensitive to outliers or overrepresentation, which might explain why distance to water, a factor biologically relevant for many mammals, emerged only in this model type. The fact that distance to water was not significant in the utilization intensity model raises important questions. Was this due to a lack of variability in that predictor within high-use areas? The manuscript would benefit from a more thorough exploration of why different variables emerged as important in each model, and what this implies for habitat selection and conservation planning for the Chinese pangolin.

Reply: We thank you for these valuable comments. Regarding the differences in key variables between the habitat models based on the two data types, we analyzed the differences in ecological processes between “habitat selection” and “habitat use” in the revised manuscript. We concluded that the presence-absence model is more likely to describe the process of habitat selection, i.e., whether a species occurs in a certain habitat type, while the utilization intensity model is more inclined to capture the intensity of habitat use, i.e., the density of burrows distributed in different habitat types.

In this study, the presence-absence model identified distance to water and slope as significant factors, which may reveal the response of the Chinese pangolin to water resource accessibility and underlying topographic features. On the other hand, profile curvature and slope were identified by the intensity model, suggesting a further preference of the Chinese pangolin for micro-topographic features, reflecting specific selection characteristics. This difference is highly consistent with the sensitivity of the two types of data and with the ecological processes involved.

Thus, we suggest that this difference in significant variables reflects differences in ecological processes: the presence-absence model captures “whether to use”, while the intensity-of-use model captures “use intensity”. Thank you again for your careful review, which has helped us to deepen the interpretation of the two-step ecological process.

Q5: The statement that the binary model "explains only 62.2% deviance" seems to unnecessarily downplay its performance. In fact, this is only marginally lower than the 65.2% reported for the intensity model, and the binary model also shows a higher pseudo R². This suggests that its explanatory power is not significantly weaker, and may even outperform the intensity model in some respects, depending on the evaluation criteria used. Therefore, framing the results in a way that suggests a clear superiority of one model over the other may be misleading.

Reply: We thank the reviewer for these valuable comments. We agree that different data types may reveal different aspects of a species' response to environmental variables and that our presentation was too subjective. We have revised these subjective statements throughout the text.
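For reference, the deviance-explained statistic discussed in this exchange can be computed as one minus the ratio of residual to null deviance. The R sketch below is illustrative only: the data are simulated, `slope` is a hypothetical covariate, and the Poisson count model merely stands in for an intensity model, which is an assumption and not the authors' specification.

```r
# Proportion of null deviance explained by a fitted glm.
dev_explained <- function(fit) 1 - fit$deviance / fit$null.deviance

set.seed(7)
n     <- 300
slope <- runif(n, 0, 40)                   # hypothetical covariate
count <- rpois(n, exp(-1 + 0.05 * slope))  # simulated burrow counts
pa    <- as.integer(count > 0)             # presence-absence derived from counts

fit_count <- glm(count ~ slope, family = poisson())
fit_pa    <- glm(pa ~ slope, family = binomial())

de <- c(intensity        = dev_explained(fit_count),
        presence_absence = dev_explained(fit_pa))
round(100 * de, 1)  # percentages for the two models
```

Because the two models are fitted to different responses with different likelihoods, a gap of a few percentage points between their deviance-explained values is best read descriptively and not as a formal test of which model is better.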

Q6: In light of this, I would suggest reconsidering the structure and focus of the article. Rather than emphasizing the limitations of presence-absence models upfront, the discussion could first present and compare the habitat selection patterns revealed by both models. Then, a more balanced and nuanced discussion could follow, highlighting the respective advantages and limitations of each approach. Presence-absence models are widely used in ecology for valid reasons, including their robustness and interpretability, and should not be framed primarily in terms of what they fail to capture. A deeper, comparative analysis of why the models differ and what this implies for ecological inference would significantly strengthen the manuscript.

Reply: We thank you for the valuable comments. We reorganized the logic of the Discussion section according to your suggestion. First, we discussed the characteristics of habitat use along the process from habitat selection to habitat utilization. Second, we compared the advantages and disadvantages of the two types of models for fitting specific environmental variables. Finally, we explored the application of the two data types for measuring habitat characteristics in terms of R², DE and AIC.

Q7: I would also question the suggestion in the conclusion that habitat utilization intensity data should be "prioritized" for in situ conservation. While this type of data can indeed provide finer-scale ecological insights, its collection is often resource-intensive, and its interpretation may be more sensitive to issues such as detectability, sampling effort, and spatial autocorrelation. In contrast, presence-absence data, while simpler, are more practical for large-scale monitoring and may offer more robust inferences in certain contexts, particularly when individual-level clustering may bias intensity measures. Rather than prioritizing one data type over the other, I would recommend emphasizing the complementary value of both. Each type captures different dimensions of species-habitat relationships, and their combined use can lead to a more holistic understanding.

Reply: We thank you for your attention to the practical conservation value of different data types. We fully agree that presence-absence data have important applications in large-scale field monitoring, species habitat suitability assessment, and data-constrained conservation scenarios. The feasibility and advantages of presence-absence data in large-scale scenarios are also clearly expressed in the Discussion section.

However, for specialized mammal species with low densities and high crypticity, selecting a suitable data type is especially critical for revealing the habitat selection characteristics of the species. The Chinese pangolin is a good example. We also incurred substantial manpower costs in this investigation, which might be unrealistic for many other animal species. As a species for which burrows are a core ecological feature, its burrow density carries more information about habitat use intensity and preference, which can effectively support the identification of core utilization areas and the refined management of in situ conservation. In contrast, presence-absence data are more susceptible to stochastic interference and provide only coarse-grained information, which makes it difficult to accurately reflect the spatial pattern of utilization intensity for small Chinese pangolin populations. Therefore, we believe that habitat use intensity data should be prioritized for scientific assessment and decision-making in in situ conservation of the Chinese pangolin. Presence-absence data undoubtedly play an important role in large-scale and rapid habitat assessment, and we have retained a positive discussion of the complementarity of the two data types. Thank you again for your valuable suggestions.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

I appreciate the authors’ efforts to address the methodological concerns raised in the first round of review, and I find that the revisions, both in the manuscript and in the detailed replies, have substantially improved the clarity and rigor of the work. While perfect detection cannot be assumed, I acknowledge the efforts made to minimize bias through consistent observers and the use of an offset term for transect length. The discussion now appropriately addresses the limitations, and I find the current explanation satisfactory given the nature of the data. In addition, the revised discussion presents a more balanced perspective on the strengths and limitations of both presence-absence and habitat utilization intensity models. In summary, I believe the authors have adequately addressed the concerns raised in the initial review. I support the publication of this manuscript in its revised form.
