1. Introduction
Monitoring biodiversity and its potential threats is a prerequisite for successful biodiversity conservation. Such monitoring documents the threats facing biodiversity and enables both the evaluation of their impact and the assessment of conservation strategies [1]. There are numerous and diverse methods to monitor biodiversity and the threats facing it. These range from ground-truth data collection by field teams to remotely sensed data collected from satellites, camera traps (CTs), autonomous acoustic recording units, and drones [
2,
3,
4,
5,
6,
7,
8,
9,
10]. However, the ease of use, endurance capabilities, and low maintenance costs of camera traps, as well as the easy-to-interpret data that they provide, make them a prevalent technology for monitoring wildlife and associated threats, including poaching, human-wildlife conflict, and the effects of habitat degradation [
3,
7,
8,
9]. For example, the average CT battery life enables continuous target monitoring, and even at high-sensitivity rates with increased CT image captures, batteries can last, on average, two to four weeks [
10]. This is a considerable advantage over similar surveillance and monitoring tools, such as unmanned aerial vehicles (UAVs), which have an average flight time of 20–30 minutes. More advanced UAVs, including solar-powered models, can reach 25 days, but they are not widely used because of their current payload capacities and fragile components [
11]. In addition, the advancement in automated networked CTs for image communications allows for more efficient data collection and transmission [
9], without the need for full-time human operation as with UAV applications. Such advancements in CT image transmissions, coupled with long-term battery life, enable continuous, near-real-time data transmission. However, even with CT advancements, the vast number of images that often results from deployments poses a challenge to users. This hinders the application of CTs as a full near-real-time tool, as image analysts, who are usually experts but increasingly also citizen scientists (trained volunteers) [
12], must assess images manually to detect, identify, and count targets, which is a time-consuming process [
12,
13,
14,
15]. This prevents the effective application of CTs in time-sensitive situations, such as monitoring threats including poachers, as it is unrealistic to expect analysts to be on standby for image evaluation, regardless of image transmission platforms, e.g., satellites or GSM links [
16].
To increase CT image processing speeds for time-sensitive applications, some research groups have combined machine learning (ML) with citizen scientists to automatically detect and identify humans and animals (targets) within CT images [
16,
17,
18]. Several groups have taken a further step towards automation and predominantly or solely used ML for such tasks [
19,
20,
21,
22,
23]. These studies have indicated overall detection accuracies ranging from 68% to 93% [
20,
22], illustrating the capacity for ML to detect targets within previously tested conditions.
Ideally, we would independently apply ML for object detection in CT images irrespective of potential influences on detection probability (DP) and correct classification (CC). However, many of these studies have used images of whole animals to reduce occlusion or used single-species images and simplified conditions, likely leading to increased accuracy biases. This includes Norouzzadeh et al. [
19] using single-species images, likely contributing to accuracies of 96.6% on a 3.2 million-image Snapshot Serengeti dataset from Tanzania. Additionally, Norouzzadeh et al. [
22] further investigated ML applications using the Snapshot Serengeti dataset to reduce overall manual processing and ML model training. This time they manually cropped and segmented images containing whole animal bodies to increase animal identification. This likely led to their ML performance matching state-of-the-art citizen scientist accuracies for the 3.2 million image dataset with only 14,100 manual classification labels, reducing manual labelling effort by over 99.5%. Similarly, Yu et al. [
21] cropped over 7000 camera trap images from Barro Colorado Island, Panama and Hoge Veluwe National Park, Netherlands. This resulted in an 82% accuracy of detection and classification of 18 mammalian species using ML methods.
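The labelling-effort reduction quoted above for the Snapshot Serengeti dataset follows from simple arithmetic on the two counts given in the text (3.2 million images, 14,100 manual labels). The sketch below only reproduces that arithmetic; the function name `labelling_reduction` is illustrative and is not taken from the cited study.

```python
def labelling_reduction(total_images: int, manual_labels: int) -> float:
    """Return the fraction of images that did not need a manual label."""
    if total_images <= 0 or not 0 <= manual_labels <= total_images:
        raise ValueError("counts must satisfy 0 <= manual_labels <= total_images")
    return 1.0 - manual_labels / total_images


# Counts as quoted in the text for the Snapshot Serengeti dataset.
reduction = labelling_reduction(total_images=3_200_000, manual_labels=14_100)
print(f"Manual labelling effort reduced by {reduction:.2%}")
```

The result is consistent with the "over 99.5%" figure stated in the paragraph.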
Although in some cases, combining citizen scientist and ML methods may be beneficial, such methods have not been practiced for target detection and classification within CT images [
17,
18]. However, Willi et al. [
17] combined analyst and ML methods, using citizen scientists to pre-process images. They then tested CNN performance on empty image identification accuracy (91–98%) and species identification accuracies (88–92%) of all CT images, reducing analyst image processing time by 43%. Nevertheless, the need for researchers to simplify conditions or combine analyst and ML processing methods for sufficient accuracies underlines the importance of mitigating external influences on target DP and CC in CT images.
There is, however, increasing promise in ML methods as an independent approach to CT image processing, as studies focus on improving ML performance in comparison with analysts [
22]. An example of this includes Thangarasu and Kaliappan [
23], comparing machine-learning and deep-learning model accuracies for species identification and finding >95% accuracy when detecting and classifying 19 mammalian species. Nonetheless, such performance testing has not yet incorporated an extensive amount of external influential factors.
The influence of certain factors, including effective detection distance and species size, on CT trigger probability is well documented [
24,
25,
26]. However, there is no published study focusing on their influence on the ability to detect targets within CT images, although distance and species size class have both been identified as reliable predictors of wildlife DP due to their influence on CT trigger probability [
25,
26]. The influence of wildlife occlusion on the DP of targets in CT images, particularly coupled with increasing distance, has been of interest. Willi et al. [
17], for example, found that increased distance from the CT increased the occlusion probability of the animal due to vegetation between the animal and the CT. Similarly, studies have found that occluded species-specific characteristics significantly influenced wildlife CC negatively. For example, Yu et al. [
21] found negative influences of occluded species-specific characteristics, including coloured patches, stripes, spots, and overall body composition, on the CC of ungulates. Moreover, Gomez Villa et al. [
16] found negative influences of partially occluded species-specific characteristics, including spots and antler shape, on the CC of red brocket and white-tailed deer species.
Some studies have highlighted differences in light illumination and their influence on wildlife DP and CC in CT images, including Yu et al. [
21] finding that dark lighting (dusk) presented a negative effect on DP. This also caused classification biases of diurnal species due to the appearance of dense foliage covering the CT, commonly leading to grayscale images. They also found biases of some cathemeral species frequently active during daylight. This is because, as expected, the authors did not train the ML algorithm to detect and classify diurnal species in grayscale images, and cathemeral species that are commonly active during daylight naturally lack grayscale images for model training. The biases caused such species to be commonly classified as cathemeral species, including
Crocuta crocuta. Nonetheless, this study still resulted in 88.9% accuracy, using deep regions with convolutional neural networks (R-CNNs) to detect and correctly classify species in CT images from the Snapshot Serengeti dataset.
No known study has focused on the influences of human-related factors and their effect on human DP and CC in CT images. That said, Hambrecht et al. [
2] used thermal infrared (TIR) and red, green, and blue (RGB) drone imagery. They found that human occlusion due to canopy density, increasing distance from the centreline (the centre of an experiment created by the investigator), and analyst performance were the main influences on human DP within TIR drone imagery. The authors also assessed clothing colour contrast relative to the background and its influence on human detection, finding that red and blue were the most influential colours for detecting humans using RGB drone imagery.
Although many studies focus on varying factors and their influence on wildlife detection and classification in CT images, no study has focused explicitly on the potential influence of external factors on the DP and CC of wildlife and humans in CT images. Therefore, in this study, we assessed the potential influence of the following factors: species (characteristics), analyst performance, vegetation type, degree of occlusion, distance from the CT, time-of-day, human/wildlife (target) height (metres), colour contrasts of clothing, and orientation towards the CT on the DP and CC of targets within CT images (see Figure 1 for a visual overview of the tested factors within CT images). Additionally, we investigated multi-factor influences on ML performance compared with analyst performance for true-positive detections and classifications. We collected the data during the standard operation of the cameras within western Tanzania. Afterwards, we collated the training and testing data to assess the influence of the tested factors on target DP and CC within CT images, using ML and analyst methods. We then compared the performance of these methods for true-positive detections and classifications.
Illustration of Camera Trap (CT) Image Scenarios Incorporating Tested Factors
Figure 1.
Examples illustrating different CT image scenarios, incorporating the factors measured in this study. Human images include (a) vegetation type (miombo woodland), time-of-day (daylight), distance (<5 m), the colour of clothing (green), orientation towards the CT (yes), height (1.9 m), and occlusion (68–100%); (b) Vegetation type (miombo woodland), time-of-day (dusk), distance (15–30 m), height (variations) and occlusion (68–100%); (c) Vegetation type (riverine forest), time-of-day (daylight), distance (10 m), the colour of clothing (green), orientation towards the CT (no), height (1.7 m), and occlusion (68–100%). Wildlife image examples include (d) Species (Philantomba monticola), vegetation type (miombo woodland), time-of-day (daylight), distance (0–4.9 m), orientation towards the CT (no), and occlusion (68–100%); (e) Species (Crocuta crocuta), vegetation type (miombo woodland), time-of-day (dusk), distance (5–9.9 m), orientation towards the CT (no), and occlusion (34–67%); (f) Species (Tragelaphus sylvaticus), vegetation type (riverine forest), time-of-day (daylight), distance (>10 m), orientation towards the CT (no), and occlusion (68–100%).
We initially proposed an alternative hypothesis for each factor tested. These were as follows:
A simultaneous increase in a human's or animal's distance from the CT and in occlusion would negatively influence at least one experiment, and denser vegetation cover would further contribute to this influence for one or more models.
A darker (green) colour contrast in comparison to the background would negatively influence human DP.
Decreases in the size of humans or wildlife would decrease their probability of detection, respectively, and darker times of day (dusk) would negatively affect human or wildlife DP and CC.
Occluded or partially occluded characteristic traits for similar species would decrease wildlife CC.
DP and CC would increase with increased human or animal orientation towards the CT.
There would be a significant positive difference in analyst performance on target DP and CC.
Lastly, the ML method would present higher or equivalent true-positive detections and classifications than the analyst method for one or more experimental models.
To test our hypotheses and meet the study’s aims, we have structured this article as follows: in Section 2, we provide a detailed description of the experimental setup and analytical processes. Section 3 provides quantitative data results, presenting the degree of influence of the analysed factors and comparing the performance of both ML and analyst methods. This finally leads to Section 4, which reviews the impact of the tested factors, concluding with several recommendations and discussing the potential of ML to detect and correctly classify targets within CT images irrespective of such influencing factors.
4. Discussion
This study aimed to identify factors influencing target DP and CC in CT images and develop recommendations to mitigate such influences. We achieved this by using both ML and human analyst approaches, alongside performance comparisons of ML and analysts for target DP and CC under variable conditions.
Given the results of the experiments, our theory is that specific factors do indeed significantly influence the detection and classification of targets using both ML and analysts. More specifically, distance and occlusion, particularly when coupled with increased vegetation density, presented the most significant effect on DP and CC. Various studies have highlighted different influences on the detection or classification of targets within CT images using ML approaches [
16,
17,
19,
20,
21,
22,
23]. However, none has explicitly evaluated all the factors considered in this paper and their interacting effects on DP and CC within natural experimental conditions. Provided that users consider the following recommendations and apply the correct setup and design methods, the knowledge provided should facilitate increased DPs and CCs of wildlife and humans within CT images using both ML and analyst approaches. Therefore, this paper contributes toward the successful application of ML methods as a tool for target DP and CC within CT images, for the time-sensitive monitoring of threats facing biodiversity, regardless of external influences.
As an equation, the main theory would be as follows:
where A is significantly reduced detection and classification of targets for most tested models and B is the most significant increase in DP and CC of targets.
4.1. Important Factors to Consider for Improving Target DP and CC
The reduction in effective detection and classification distance of targets in CT images with increased distances (10–30 m) is comparable with previous studies. This includes Norouzzadeh et al. [
22] who found reduced effective detection distance when assessing the DP of species with varying body mass on CT trigger probability. This is consistent with Findlay et al. [
43] who noted that close (1 m) and far passes towards the peripheral viewpoint of the CT decreased trigger probability. Similarly, Hambrecht et al. [
2] reported a significant influence of distance from the centreline on the DP of ‘poachers’ using drones and human analysts. Additionally, Marin et al. [
44] reported the complexities of ML detection and classification of ‘poachers’ in partially occluded images. Likewise, the impact of increased occlusion levels on detection and classification within most experimental conditions agrees with previous findings [
16,
21].
When combined with further distances and occlusion, closed vegetation (riverine forest) led to increased adverse effects of distance and occlusion on human DP in all daylight models. The closed vegetation presented a more significant negative influence on detecting and classifying humans in daylight CT images than within dusk CT images captured in open vegetation (miombo woodland) at the same distance and occlusion. Similarly, Bukombe et al. [
26] reported that although seasonal differences alone presented no influence, they adversely reduced the DP of ungulates in CT images when coupled with species size and increased distance. However, we only compared vegetation type differences for human daylight CT experiments. Therefore, we could not rule out denser vegetation as an influence on target DP and CC within dusk CT images. Similarly, vegetation type was not included in the wildlife experiments as only images containing riverine forest vegetation were used due to a lack of wildlife images captured in open woodland.
Three main factors contribute to the magnitude of non-significant effects on human CC and DP models: (1) the natural outlier conditions found at a 5 m distance for daylight DP models and 25–30 m distance for the ML-dusk-DP model, (2) the high probability that colour presents no influence on the CC of humans, and (3) the high true-positive detection and classification responses increase the likelihood that most factors presented a positive influence or no influence at all on DP or CC rates. This is because analysts presented no false-negative responses, and they reached 71.7% for daylight true-positive DPs, similar to ML with 76.5%.
Dusk experiments presented a significant adverse effect on wildlife DP and CC for cathemeral species. This strongly supports the conclusions of Gomez Villa et al. [
16] who found that greyscale CT images likely caused diurnal and cathemeral species classification biases towards commonly nocturnal species, including
C. crocuta. Similarly, dusk lighting conditions significantly affected the DP of humans, particularly at further distances (20–30 m). However, Hambrecht et al. [
2] found the time-of-day factor to be an insignificant predictor of human DP in drone imagery.
Most of the species’ classes influenced DP and CC, with all but
C. ascanius significantly negatively impacting wildlife CC in CT images using both ML and analyst approaches. This was particularly the case when correctly classifying species of similar characteristics, including
T. sylvaticus,
S. grimmia, and
P. monticola. These findings indicate that physical similarities, including the general morphological size and shape of
T. sylvaticus and
S. grimmia, contributed to the probability of correctly classifying them. Similarly, the female bushbuck class and both duiker species, when coupled with occlusion, led to reduced CCs, particularly for ML methods, which may have been due to the similar face shapes and markers of these species. This is consistent with previous findings [
16,
21].
Species size seemed to be a particularly influential characteristic for analysts and ML-DP. The reasoning for this conclusion is that all the species that significantly positively influenced DP were medium-large, except
H. cristata. However, we removed the size class factor from the analysis due to its high multicollinearity with the factor species. Therefore, although we found a significant correlation between the species and size class factors, we cannot be certain that size was an influencing variable. Yet, there is previous evidence of increasing size positively influencing the DP and CC of wildlife in CT images [
26]. Large sizes led to higher DP at greater distances than smaller sizes. Nevertheless, as Gomez Villa et al. [
16] reported, species characteristic similarities adversely affect species-specific recognition of similar ungulates when using ML methods. However, human height variance presented no influence on human detection and classification within CT images. Although, as previously stated, much has been studied on wildlife body mass size influences on DP and CC [
26], to the best of our knowledge, no study has focused on human height variances and their impact on human DP and CC within CT images.
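The multicollinearity between the species and size class factors noted above can be illustrated with a small check. When every species belongs to exactly one size class, the indicator (dummy) variable for a size class equals the sum of the indicators of its member species, so the two factors cannot be separated in a regression. The sketch below uses hypothetical records and placeholder species labels; it is not the study's data or analysis code.

```python
from collections import defaultdict

# Hypothetical (species, size class) records; labels are placeholders.
records = [
    ("species_a", "small"),
    ("species_a", "small"),
    ("species_b", "medium-large"),
    ("species_c", "medium-large"),
    ("species_c", "medium-large"),
]


def size_determined_by_species(rows):
    """True if each species is only ever observed with one size class."""
    seen = defaultdict(set)
    for species, size in rows:
        seen[species].add(size)
    return all(len(sizes) == 1 for sizes in seen.values())


def size_dummy_is_sum_of_species_dummies(rows, size_class):
    """True if the size-class indicator equals the summed indicators of
    the species observed in that class, for every record."""
    members = {sp for sp, sz in rows if sz == size_class}
    return all((sz == size_class) == (sp in members) for sp, sz in rows)


nested = size_determined_by_species(records)
dependent = size_dummy_is_sum_of_species_dummies(records, "medium-large")
print(nested, dependent)
```

When both checks hold, the size class columns are exact linear combinations of the species columns, which is one reason a size effect cannot be confirmed independently of species.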
The analyst performance factor significantly increased the probability of wildlife CCs. In contrast, Katrak-Adefowora et al. [
45] reported low levels of citizen scientist CCs (51.8%) in comparison to professional biologists (77.6%). Significant variation between analyst responses may have been partially due to training methods used prior to analyst-DP and CC testing.
Darker contrast colours (green) against the background negatively influenced the DP of humans in daylight CT images using the analyst method and CCs using the ML method as opposed to a lighter blue. This is consistent with Hambrecht et al. [
2] who found that darker contrast colours (green) reduced the DP of humans within aerial drone RGB imagery in comparison to lighter colours (red and blue). Additionally, poachers are frequently recorded wearing green and brown camouflage to blend into the background and reduce the chance of being seen [
6]. Although much is known about the use of camouflage to disguise oneself, little effort has been focused on the degree of influence colour contrasts have on DP and CC. This study strongly supports that green presents a significantly high degree of influence towards reducing human DP in CT images compared to lighter colours.
Targets orientated towards the CT positively influenced human and wildlife DP using the analyst approach and the DP of wildlife using the ML approach. In contrast, Hofmeester et al. [
46] found anterior or posterior poses relative to the CT reduced CT trigger probability due to passive infrared (PIR) sensor detection difficulties. However, their study focuses on the influence of target proportions on CT sensor triggers rather than potential influencing target features on target DP in CT images.
In conclusion, distance and occlusion illustrated the most significant influence on both DP and CC, particularly when integrated with dense vegetation for human experiments; vegetation type was not assessed for wildlife experiments. Wildlife CC was particularly influenced by partial or complete occlusion of species, likely caused by occluded visual characteristics. Darker contrast colours significantly reduced human DP. Targets orientating towards the CT illustrated a positive influence on target DP and CC. There was a significant positive variation between analyst performances. Human height presented no influence, and darker lighting conditions significantly negatively influenced human DP and all wildlife models. Additionally, although the analysts performed better overall for the wildlife models, the ML method outperformed analysts for the human experiments (see
Table 2 and Table 3). These results allow us to draw conclusions on the proposed hypotheses (see Table 4).
4.2. Recommendations to Mitigate Influential Factors
The influence of multiple factors on target detection and classification within CT images raises cause for concern if users do not consider these during CT study design and implementation. However, some precautions and actions could be applied, reducing, and potentially mitigating, such influences.
Many factors should be considered for wildlife monitoring, particularly the monitoring of illegal activity using CTs, including the CT’s structure and appearance (camouflage, robustness, etc.), its setup, and its software capabilities. However, we instead focus on the following CT features to direct users towards the most suitable CT type for effective near-real-time use: trigger speed, battery life, field of view (FOV), resolution quality, and network capabilities. Additionally, we overview CT distribution and abundance, ML model type and training, analyst training, and Random Subspace Methods (RSM) for managing target occlusions within CT images.
Various commercial camera traps are available. In addition, most conservation organisations are making strides in advancing CTs, including the PoacherCams Panthera CT (Panthera, New York, NY, USA) [
47] and the Trailguard AI camera by RESOLVE (Washington, DC, USA) [
48]. Therefore, simply recommending a CT type would be of little use, given the ever-evolving technology and the vast range of CTs. Instead, we aim to highlight the factors that we find important for effective near-real-time threat monitoring based on the observations within this study and previous research recommendations [
10,
49].
The suitability of CTs for various conservation studies is widely acknowledged [
50,
51,
52,
53]. However, we focus on CT features that are most important for optimum target detection and classification within CT images. One particularly important feature is trigger speed. This has had a significant influence on target detection rates and has been one of the most evolved aspects of CT features, with trigger speeds now reaching approximately 0.5 seconds [
49]. A reliable and long battery life is essential for effective threat monitoring. As previously mentioned, most CTs on high-sensitivity settings can last, on average, two to four weeks [
10]. Additionally, new CT advancements, including Trailguard AI [
48], have enhanced rechargeable batteries and can now reach up to 1.5 years on a single battery. More advanced CTs using network power tend to consume more battery, due to the WiFi SD cards drawing power, but advancements in rechargeable batteries and solar energy usage should see improvements with future models [
50]. A wide FOV is essential for optimal target detection and classification [
54]. The average CT FOV is 40–60° wide and extends from 5 m up to 30 m, depending on the camera height positioning and vegetation density. We also focus on optimal resolution quality, a novel yet important factor for increasing the detection and classification of targets in CT images. Most networked CTs offer high resolution as standard, with 4K video and images averaging 30 megapixels. We recommend such high quality, as it provides optimum clarity for target detection and classification. However, one of the most important factors to consider when recommending a CT for near-real-time threat monitoring is the CT’s network capabilities. Commonly, CTs with remote network capabilities rely on cellular connections or WiFi connectivity to send images. However, more recent developments are integrating AI into the CT system to send pre-processed near-real-time information from anywhere to end users [
51]. Whytock [
53] tested this using a commercial Bushnell Core 24MP Low Glow 119936C CT and customised open-source hardware. The main frame hardware consists of a smart-bridge controller with a custom circuit board, a LoRa STM32L0 ultra-low-power microcontroller, and a RockBLOCK satellite modem, connected to a Raspberry Pi 4 Compute module. The system is designed to consume power only when necessary, so it activates the Raspberry Pi to download and classify images using AI only when it receives a message from LoRa (the microcontroller within the smart bridge). After classifying the images, the Raspberry Pi sends the results via satellite and powers down to save battery. This type of open-source method is a reasonably cost-effective option. Users and developers would need to understand the mechanics of integrating and altering open-source hardware, including using Raspberry Pi devices for scripting and ML connectivity. However, if users could overcome this issue, this type of CT approach would be highly effective for real-time threat monitoring.
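The wake, classify, send, and power-down cycle described above can be sketched as a small simulation. This is only an illustration of the control flow under stated assumptions: the class `SmartBridge`, its method `on_lora_message`, and the toy classifier are hypothetical names, the satellite uplink is replaced by an in-memory list, and no real LoRa, RockBLOCK, or Raspberry Pi interface is used.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SmartBridge:
    """Hypothetical stand-in for the smart-bridge controller."""
    classify: Callable[[bytes], str]      # assumed AI classifier on the Pi
    pi_powered: bool = False              # compute module is off by default
    wake_count: int = 0
    sent: List[str] = field(default_factory=list)  # stands in for the uplink

    def on_lora_message(self, images: List[bytes]) -> None:
        """Power up the compute module, classify, send, then power down."""
        self.pi_powered = True
        self.wake_count += 1
        try:
            labels = [self.classify(image) for image in images]
            self.sent.extend(labels)      # placeholder for satellite send
        finally:
            self.pi_powered = False       # always power down to save battery


def toy_classifier(image: bytes) -> str:
    """Illustrative rule only; a real system would run an ML model."""
    return "human" if image.startswith(b"H") else "empty"


bridge = SmartBridge(classify=toy_classifier)
bridge.on_lora_message([b"H-frame-001", b"x-frame-002"])
print(bridge.sent, bridge.pi_powered, bridge.wake_count)
```

The `try`/`finally` block reflects the design goal stated in the text: the compute module returns to its powered-down state after each batch, even if classification fails.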
CT distribution and abundance will vary depending on many conditions, including individual objectives, CT detection range, target range, funding availability, and environmental conditions. However, we simply provide guideline recommendations of CT placement methods for optimum resource use and increased detection probability.
Based on this paper’s findings, we determine that CTs should ideally be placed within 30 m of one another throughout the intended site for optimum detection and classification probability of both humans and wildlife. However, this is unlikely to be feasible for real-world applications. Furthermore, in most cases, this is not required when users apply methods to optimise CT placement. Therefore, we instead offer recommendations for CT placement and distribution methods to increase detection and classification probability. Opportunistic CT placement is the most common approach for long-term wildlife monitoring [
55]. The primary focus of CT sites would ideally be within the following landmark areas: feeding and drinking sites, game trails, downed logs, and other minor landmark points, all of which have been found to increase trigger probabilities [
55]. Alternatively, the spatial monitoring and reporting tool (SMART) could be used as an additional measure to complement optimal CT placement [
56]. This tool allows rangers and other users to collaborate and share data including recording wildlife tracking, illegal activity monitoring, poaching camps, traps, and patrol routes, along with current CT and alternative sensor placements. The data can be used to create maps and reports and perform analysis to assess threats effectively and plan for optimal monitoring within specific target areas to prioritise funding and staffing resources. With such implementations, CT sites could be situated at typical hotspots as mentioned above, with a grouping of CTs in each location (depending on the desired area covered) with 15–30 m spacing between them.
For ML applications, we do not go into extensive detail on ML performances and comparisons but instead focus on the most prevalent architectures for target detection and classification as recommended by [
57]. Hui [
57] compares the Faster R-CNN ResNet101 and Inception ResNet Version 2 (V2), the Region-Based Fully Convolutional Network (R-FCN) ResNet 101, and Single Shot Detector (SSD) with both MobileNet and Inception V2. The R-FCN and SSD architectures are faster overall, with the SSD on MobileNet presenting the highest mAP (19.3) for real-time processing. However, the Faster R-CNN using the Inception ResNet provides the highest accuracy at one FPS for all tested classes. The R-FCN architecture using the Residual Network presents a better balance between speed and accuracy. However, the Faster R-CNN with ResNet 101 can achieve a similar overall performance. Given the overall balance of speed and accuracy, we recommend the Faster R-CNN with ResNet 101 architecture based on performance testing [
57] and its tested application within our study.
The primary recommendation for increasing analyst performance is to apply on-demand resources commonly defined as “just-in-time” training. For example, Katrak-Adefowora et al. [
45], who utilised such methods by training 94 citizen scientists (analysts) on wildlife species with CT images, found that detection rates increased from 51.8% to 81.9%. Moreover, after training, analysts reported better confidence in species classifications.
Applying a random subspace method (RSM), a learning strategy in which the features of a target image are randomly sampled for ML training, has proven successful in managing partial human occlusions. Several studies have successfully detected humans within partially occluded still images, resulting in true-positive detection accuracies of 75.6% [
44,
58]. Such an application improves target detection performance within partially occluded images without compromising detection accuracy for non-occluded images [
44]. This method not only accounts for increased robustness to occlusion but, in turn, potentially reduces the influence of occluded species-specific characteristics on wildlife CC.
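The random subspace idea can be sketched in a few lines. In the minimal example below, each ensemble member is trained on a random subset of feature indices and predictions are combined by majority vote. The nearest-centroid base learner, the voting rule, the toy six-feature vectors, and all function names are assumptions made for illustration; they are not the detectors used in the cited studies.

```python
import random
from collections import Counter


def centroids(X, y, feats):
    """Per-class mean over the selected feature indices only."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def fit_random_subspace(X, y, n_models=15, n_features=3, seed=0):
    """Train one nearest-centroid model per random feature subset."""
    rng = random.Random(seed)
    dim = len(X[0])
    ensemble = []
    for _ in range(n_models):
        feats = sorted(rng.sample(range(dim), n_features))
        ensemble.append((feats, centroids(X, y, feats)))
    return ensemble


def predict(ensemble, x):
    """Majority vote across the subspace models."""
    votes = Counter()
    for feats, cents in ensemble:
        def dist(label):
            return sum((x[f] - c) ** 2 for f, c in zip(feats, cents[label]))
        votes[min(cents, key=dist)] += 1
    return votes.most_common(1)[0][0]


# Toy training data: six hypothetical features per image region.
X = [[1.0] * 6, [0.9] * 6, [0.0] * 6, [0.1] * 6]
y = ["target", "target", "background", "background"]
ensemble = fit_random_subspace(X, y)

clean = [1.0] * 6
occluded = [1.0] * 5 + [0.0]   # last feature hidden, e.g. by vegetation
print(predict(ensemble, clean), predict(ensemble, occluded))
```

Because each model sees only three of the six features, a single occluded feature affects at most part of the ensemble, which is the intuition behind the robustness to partial occlusion described above.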
4.3. Comparison of ML and Analyst Methods for DP and CC Performance
In this study, the overall wildlife classification and detection rates using the ML approach were low compared with similar studies, which report detection and classification estimates of up to 93% [
22]. Such low rates are potentially due to a low (1000 count) threshold of training images per class. Similar studies report dataset counts ranging from 22,000 [
16] to 189,000 [
19]. Moreover, Gomez Villa et al. [
16] report that model training influences wildlife DP. However, the ML approach demonstrated increased DPs within daylight and dusk human DP models. Moreover, for the dusk DP experiments, the ML approach performed better than the analyst approach. However, low analyst-CC results may be partly attributable to analyst experience.