Author Contributions
Conceptualisation, A.A.T., C.J.N., T.W. and E.B.; Data curation, A.A.T. and T.W.; Formal analysis, A.A.T.; Investigation, A.A.T.; Methodology, A.A.T. and S.P.; Project administration, C.J.N. and E.B.; Resources, T.W.; Software, S.P.; Supervision, C.J.N.; Validation, A.A.T., M.P.H., S.W. and E.B.; Visualisation, A.A.T.; Writing—original draft, A.A.T.; Writing—review and editing, C.J.N., M.P.H., S.W. and E.B. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Left: Aerial view of the study site, Bass Rock (N 56°6′, W 2°36′). Image credit: UK Centre for Ecology and Hydrology. Right: Delimited areas used for counting during the previous decadal censuses. Image taken from Murray et al. (2014) [9].
Figure 2.
Basic workflow for a standard DL Convolutional Neural Network (CNN) training process. Image taken from Akçay et al. [18].
Figure 3.
Complete workflow for the research project, from data acquisition to final product.
Figure 4.
Example of the tiling process with the 2022 dataset. The entire orthomosaic is split into individual tiles measuring 200 × 200 pixels using the ArcGIS ‘Split Raster’ function. Each 200 × 200 pixel tile (highlighted in red here) spans approximately 6.4 m per side on the ground.
Figure 5.
Example of the tiling process with the 2023 dataset. The entire orthomosaic is split into individual tiles measuring 500 × 500 pixels using the ArcGIS ‘Split Raster’ function. Each 500 × 500 pixel tile (highlighted in red here) spans approximately 6.8 m per side on the ground.
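The tile footprints quoted in Figures 4 and 5 follow from multiplying the tile width in pixels by the ground sampling distance (GSD) listed in Table 2. The sketch below reproduces that arithmetic and, purely as an illustration, enumerates tile windows over a raster. The function names `tile_footprint_m` and `tile_windows` are hypothetical, and this is plain Python rather than the ArcGIS ‘Split Raster’ tool used in the study.

```python
from typing import Iterator, Tuple


def tile_footprint_m(tile_px: int, gsd_cm: float) -> float:
    """Ground width of a square tile in metres (tile width in pixels x GSD)."""
    return tile_px * gsd_cm / 100.0


def tile_windows(width: int, height: int, tile_px: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x0, y0, x1, y1) pixel windows row by row; edge tiles are clipped."""
    for y0 in range(0, height, tile_px):
        for x0 in range(0, width, tile_px):
            yield x0, y0, min(x0 + tile_px, width), min(y0 + tile_px, height)


# GSD values taken from Table 2: 3.22 cm (2022) and 1.36 cm (2023).
print(f"2022 tile: {tile_footprint_m(200, 3.22):.2f} m")  # 6.44 m
print(f"2023 tile: {tile_footprint_m(500, 1.36):.2f} m")  # 6.80 m
```

Both results agree with the rounded figures of 6.4 m and 6.8 m given in the captions, which suggests the two tile sizes were chosen to cover a similar ground area despite the different GSDs.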
Figure 6.
Top Left and Bottom Left: 2022 image tile, 200 × 200 pixels; Top Right: 2023 image tile, 200 × 200 pixels; Bottom Right: 2023 image tile, 500 × 500 pixels.
Figure 7.
Example of using the open-source software VGG Image Annotator to classify gannets as either ‘Dead’ (e.g., box 3), ‘Alive’ (e.g., box 2), or ‘Flying’ (e.g., box 1). Nests and man-made structures are also visible.
Figure 8.
200 × 200 pixel image tiles showing examples of the three classes of gannet; Left: live gannets on the ground appear elliptical in shape; Middle: dead gannets have their neck and wings splayed; Right: flying gannets appear larger with their wings evenly spread out.
Figure 9.
200 × 200 pixel image tile showing an example of ground-truth bounding boxes overlaid by predicted bounding boxes for each class. Pink = ground truth, red = dead (class 1), blue = alive (class 2), yellow = flying (class 3). Background is class 0.
Figure 10.
200 × 200 pixel image tile showing examples of FP predictions made by the model that are rocks or other natural features, indicated by the yellow arrows.
Figure 11.
200 × 200 pixel image tile showing examples of FP predictions made by the model that are real birds missing from the ground truth, indicated by the yellow arrows. The arrows point to red/blue bounding boxes that do not have a pink ground-truth bounding box underneath.
Figure 12.
200 × 200 pixel image tile showing examples of FN predictions made by the model, indicated by the yellow arrows. The arrows point to pink ground-truth bounding boxes that do not have an overlaying predicted red/blue bounding box.
Figure 13.
Full orthomosaic generated from the 2022 RGB imagery.
Figure 14.
Full orthomosaic generated from the 2023 RGB imagery.
Figure 15.
500 × 500 pixel image tile showing examples of live model predictions (blue) from the 2023 dataset. Gannets are clearly detected against the rocky background.
Figure 16.
500 × 500 pixel image tile showing examples of potential TP dead (left, red) and FP dead model predictions (right, red) from the 2023 dataset.
Figure 17.
500 × 500 pixel image tile showing examples of flying model predictions (yellow) from the 2023 dataset. Two gannets in flight clearly highlighted in contrast to live gannets on the ground.
Figure 18.
500 × 500 pixel image tiles showing examples of dead predictions made by the model (Left, red boxes), and the potential TP dead prediction (Right, red circle).
Table 1.
Counts of Apparently Occupied Sites (AOSs) for northern gannets on Bass Rock as published by previous surveys: 2014, 2009 [9], 2004, 1994, and 1985 [11].
Year | AOS Count |
---|
2014 | 75,259 |
2009 | 60,853 |
2004 | 48,098 |
1994 | 39,751 |
1985 | 21,589 |
Table 2.
DJI Matrice 300 RTK flight parameters for the 2022 and 2023 photogrammetry surveys of Bass Rock.
Property | 2022 | 2023 |
---|
Camera | Zenmuse L1 | Zenmuse P1 |
Band | RGB | RGB |
Flight height | 100 m | 105 m |
Flight speed | 4 m/s | 4 m/s |
Total flight time | 18 min | 15 min |
GSD | 3.22 cm | 1.36 cm |
Side lap | 70% | 70% |
Forward overlap | 80% | 80% |
Tracks flown | E–W | S–N |
No. of missions | 3 | 2 |
Nadir images acquired | 102 | 135 |
Oblique images acquired | 76 | 15 |
Table 3.
Properties for the Zenmuse L1 and P1 cameras.
Feature | 2022 | 2023 |
---|
Model | Zenmuse L1 | Zenmuse P1 |
Band | RGB | RGB |
Resolution | 20 MP | 45 MP |
Image size (3:2) | 5472 × 3648 | 8192 × 5460 |
Physical focal length | 8.8 mm | 35 mm |
Full frame focal length | 24 mm | 35 mm |
Aperture | f/4 | f/4 |
ISO | Auto | Auto |
Shutter (priority mode) | 1/1000 | 1/1000 |
Exposure compensation | −0.3 | −0.7 |
Focus | N/A | Infinite |
Table 4.
Hyperparameters fine-tuned for use in the model.
Hyperparameter | Value | Description |
---|
Train-test split ratio | 0.1 | 10% of the labelled training data held back by the model for validation. 477 out of 530 used for training, 53 for validation. |
Batch size | 32 | Training dataset split into small batches of images so model can more efficiently calculate error and update weights accordingly. |
Learning rate | 0.001 | Step size determining how fast or slow model converges to optimal weights. Chosen rate is low enough to allow network to converge within a reasonable timescale [34]. |
Momentum | 0.9 | Prevents the optimisation process from becoming stuck in a local minimum and missing the global minimum as a result. Default value. |
Weight decay | 0.0005 | Factor applied after each update to prevent weights from growing too large and creating problems with over-fitting and model complexity. Default value. |
No. of epochs | 15 | The number of times that the learning algorithm will work through the entire training dataset and update weights accordingly [35]. Can vary each time model is run. |
Confidence threshold | 0.2 | The score above which a prediction is accepted by the model; predictions scoring below it are rejected. Value chosen through trial and error. |
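As a minimal sketch, the values in Table 4 can be gathered into a single configuration object so that the split sizes and the confidence filter are derived from one place. The class `TrainingConfig` and the helpers `split_sizes` and `keep_prediction` are hypothetical names introduced for illustration; the training framework used in the study is not specified here, so nothing below depends on one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameter values as listed in Table 4."""
    validation_fraction: float = 0.1
    batch_size: int = 32
    learning_rate: float = 0.001
    momentum: float = 0.9
    weight_decay: float = 0.0005
    epochs: int = 15
    confidence_threshold: float = 0.2


def split_sizes(n_tiles: int, validation_fraction: float) -> Tuple[int, int]:
    """Return (training tiles, validation tiles) for a labelled tile set."""
    n_val = round(n_tiles * validation_fraction)
    return n_tiles - n_val, n_val


def keep_prediction(score: float, cfg: TrainingConfig) -> bool:
    """Accept a prediction only if its score is above the threshold."""
    return score > cfg.confidence_threshold


cfg = TrainingConfig()
print(split_sizes(530, cfg.validation_fraction))  # (477, 53)
```

The split of 530 labelled tiles reproduces the 477 training and 53 validation tiles stated in the table.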
Table 5.
Definitions for object detection.
Detection | Acronym | Definition |
---|
True Positive | TP | Prediction made that matches ground truth |
False Positive | FP | Prediction made that does not match ground truth |
False Negative | FN | Prediction not made where ground truth exists |
True Negative | TN | Neither ground truth nor prediction exists (ignored for object detection) |
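One common way to apply the definitions in Table 5 is to match each predicted box to a ground-truth box by intersection over union (IoU). The sketch below is an illustration under stated assumptions, not the matching procedure used in the study: the IoU threshold of 0.5, the greedy one-to-one matching, the class-agnostic comparison, and the names `iou` and `count_detections` are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_detections(truth: List[Box], preds: List[Box],
                     iou_threshold: float = 0.5) -> Tuple[int, int, int]:
    """Return (TP, FP, FN); preds are assumed sorted by descending confidence."""
    unmatched = list(range(len(truth)))
    tp = fp = 0
    for p in preds:
        best_idx, best_iou = None, iou_threshold
        for i in unmatched:
            score = iou(p, truth[i])
            if score >= best_iou:
                best_idx, best_iou = i, score
        if best_idx is None:
            fp += 1          # prediction with no matching ground truth
        else:
            tp += 1          # prediction matches a ground-truth box
            unmatched.remove(best_idx)
    return tp, fp, len(unmatched)  # leftover ground truth = FN


truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(count_detections(truth, preds))  # (1, 1, 1)
```

True negatives are never counted, in line with the last row of the table: background regions with neither a prediction nor a ground-truth box are not enumerable in object detection.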
Table 6.
Results of three model runs on the 2022 dataset to assess reproducibility when seed values are set.
| Run 1 | Run 2 | Run 3 | Average |
---|
Epoch | 13 | 16 | 21 | - |
Validation data—Live | 582 | 560 | 552 | 565 |
Validation data—Dead | 182 | 189 | 188 | 186 |
Validation data—Flying | 38 | 37 | 37 | 37 |
mAP | 0.38 | 0.37 | 0.37 | 0.37 |
Final count—Live | 19,895 | 19,254 | 19,033 | 19,394 |
Final count—Dead | 3775 | 4284 | 4243 | 4100 |
Final count—Flying | 813 | 825 | 812 | 817 |
Table 7.
Validating the predicted counts for the 2022 dataset. Model Count = counts predicted by model; FP(−) = FP predictions that were not birds (e.g., rocks); TP = TP predictions; FP(+) = misclassified FP predictions (e.g., predicted dead but actually alive); FN = FN predictions; True Count = total of TP, FP(+), and FN; % Change = percentage difference between True Count and Model Count, relative to Model Count.
| Alive | Dead | Flying |
---|
Model Count | 552 | 188 | 37 |
FP (−) | 117 | 45 | 1 |
TP | 435 | 143 | 36 |
FP (+) | 70 | 12 | 2 |
FN | 25 | 10 | 4 |
True Count | 530 | 165 | 42 |
% Change | −3.99 | −12.23 | +13.51 |
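The True Count and % Change rows of Table 7 can be reproduced from the rows above them. The sketch below assumes that % Change is computed relative to Model Count, which matches the tabulated values; the function name `validate_counts` is hypothetical.

```python
from typing import Tuple


def validate_counts(model_count: int, tp: int, fp_misclassified: int,
                    fn: int) -> Tuple[int, float]:
    """Return (true count, % change of true count relative to model count)."""
    true_count = tp + fp_misclassified + fn
    pct_change = 100.0 * (true_count - model_count) / model_count
    return true_count, round(pct_change, 2)


# Values from Table 7: (Model Count, TP, FP(+), FN) per class.
table_7 = {
    "Alive": (552, 435, 70, 25),
    "Dead": (188, 143, 12, 10),
    "Flying": (37, 36, 2, 4),
}
for label, row in table_7.items():
    print(label, validate_counts(*row))
# Alive (530, -3.99)
# Dead (165, -12.23)
# Flying (42, 13.51)
```

In this table, TP also equals Model Count minus FP(−) for every class (for example, 552 − 117 = 435), so the FP(−) row acts as a consistency check rather than an independent input.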
Table 8.
Predicted counts for each class for the complete 2022 dataset.
Classification | Model Count | % Adjustment | Adjusted Count |
---|
Alive | 18,977 | −3.99 | 18,220 |
Dead | 4285 | −12.23 | 3761 |
Flying | 808 | +13.51 | 917 |
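The Adjusted Count column in Table 8 is consistent with scaling each Model Count by its percentage adjustment and rounding to the nearest whole bird. A minimal sketch of that step follows; the function name `adjust_count` is hypothetical and the rounding rule is an assumption inferred from the tabulated values.

```python
def adjust_count(model_count: int, pct_adjustment: float) -> int:
    """Scale a model count by a percentage adjustment and round to an integer."""
    return round(model_count * (1 + pct_adjustment / 100.0))


# Values from Table 8: (Model Count, % Adjustment) per class.
table_8 = {"Alive": (18977, -3.99), "Dead": (4285, -12.23), "Flying": (808, 13.51)}
for label, (count, pct) in table_8.items():
    print(label, adjust_count(count, pct))
# Alive 18220
# Dead 3761
# Flying 917
```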
Table 9.
Validating the predicted counts for the 2023 dataset. Model Count = counts predicted by model; FP(−) = FP predictions that were not birds (e.g., rocks); TP = TP predictions; FP(+) = misclassified FP predictions (e.g., predicted dead but actually alive); FN = FN predictions; True Count = total of TP, FP(+), and FN; % Change = percentage difference between True Count and Model Count, relative to Model Count.
| Alive | Dead | Flying |
---|
Model Count | 5308 | 177 | 53 |
FP (−) | 103 | 174 | 1 |
TP | 5205 | 3 | 52 |
FP (+) | 122 | 2 | 18 |
FN | 510 | 0 | 6 |
True Count | 5837 | 5 | 76 |
% Change | +9.06 | −97.16 | +43.40 |
Table 10.
Predicted counts for each class for the complete 2023 dataset.
Classification | Model Count | % Adjustment | Adjusted Count |
---|
Alive | 44,433 | +9.06 | 48,455 |
Dead | 1510 | −97.16 | 43 |
Flying | 339 | +43.40 | 486 |
Table 11.
Description of how the DL model reduces time requirements.
Action | Time (GPU) | Time (CPU) | Description |
---|
Training | 10 min | N/A | Training model on 477 labelled image tiles. Not recommended on CPU due to computational power required to run. |
Validation | <1 min | <1 min | Validating training data with 53 labelled image tiles. |
Implementation | <2 min | <25 min | Running trained model on entire 2022/2023 dataset. |