Analytical Assessment of Pre-Trained Prompt-Based Multimodal Deep Learning Models for UAV-Based Object Detection Supporting Environmental Crimes Monitoring
Abstract
1. Introduction
2. Materials and Methods
2.1. Description of the Test Area
2.2. Flight Plan, Data Acquisition, and Photogrammetric Processing
2.3. Description of the Models
- CLIPSeg [17] (released in LA: 10 December 2024; version used: 9 July 2025):
- GroundingDINO [19] (released in LA: 12 July 2024; version used: 9 July 2025):
- TextSAM [21] (released on 2 February 2024; version used: 7 August 2025):
  - GroundingDINO: An open-set object detection (OD) model that locates objects described by a user-supplied text prompt.
  - Segment Anything Model (SAM) [22]: A segmentation model that performs pixel-wise classification into different categories. The workflow detects objects with GroundingDINO and passes the resulting bounding boxes (BBs) to SAM, which refines the segmentation by highlighting relevant features. The final output is a precise polygon representing the segmented object(s).
2.4. Data Preparation for Deep Learning
2.5. Ground Truth Dataset Annotation and Metric Computation Procedure
2.6. DL Models Inference
2.7. Preliminary Elaborations
2.8. Frame Extraction from UAV-Acquired Videos
2.9. Generative AI
3. Results
3.1. Model Inference on the Original Orthoimages
3.1.1. The Role of “Padding”
3.1.2. The Role of “Tile_Size”
3.1.3. The Role of “Batch_Size”
3.1.4. Interaction Among Padding, Batch_Size, and Tile_Size
3.2. Model Inference on Frames Extracted from UAV-Acquired Videos
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A




| Model | Parameter | Description |
|---|---|---|
| CLIPSeg | prompt | Text that describes the feature to be segmented. For visual prompting, an image URL or a local path can be provided. |
| CLIPSeg | threshold | Pixel classifications with probability values higher than this threshold are included in the result. The allowed values range from 0 to 1.0. This threshold is effective only when return_probability_raster is set to false. |
| CLIPSeg | batch_size | Number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. |
| CLIPSeg | return_probability_raster | If true, the output classified raster will be a continuous magnitude raster indicating the probability value at each pixel. |
| CLIPSeg | padding | Number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. Increase its value to smooth the output while reducing edge artifacts. The maximum value of the padding can be half of the tile size value. |
| CLIPSeg | test_time_augmentation | Performs test time augmentation while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output. |
| CLIPSeg | merge_policy | Policy for merging predictions (mean, min, or max). Applicable when test_time_augmentation is true. |
| TextSAM | text_prompt | Text that describes the objects to be detected. The input can be multiple text prompts, separated by commas, allowing the detection of multiple classes. |
| TextSAM | padding | Number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. Increase its value to smooth the output while reducing edge artifacts. The maximum value of the padding can be half of the tile size value. |
| TextSAM | batch_size | Number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. |
| TextSAM | box_threshold | The confidence score used for selecting the detections to be included in the results. The allowed values range from 0 to 1.0. |
| TextSAM | text_threshold | The confidence score used for associating the detected objects with the provided text prompt. A higher value ensures strong association but potentially fewer matches. The allowed values range from 0 to 1.0. |
| TextSAM | tta_scales | Performs test time augmentation while predicting by changing the scale of the image. Values in the range of 0.5 to 1.5 are recommended. Multiple scale values separated by commas can also be provided, for example, 0.9, 1, 1.1. |
| TextSAM | box_nms_thresh | The box IoU cut-off used by non-maximal suppression to filter duplicate masks. |
| GroundingDINO | text_prompt | Text that describes the objects to be detected. The input can be multiple text prompts, separated by commas, allowing the detection of multiple classes. |
| GroundingDINO | padding | Number of pixels at the border of image tiles from which predictions are blended for adjacent tiles. Increase its value to smooth the output while reducing edge artifacts. The maximum value of the padding can be half of the tile size value. |
| GroundingDINO | batch_size | Number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. |
| GroundingDINO | box_threshold | The confidence score used for selecting the detections to be included in the results. The allowed values range from 0 to 1.0. |
| GroundingDINO | text_threshold | The confidence score used for associating the detected objects with the provided text prompt. A higher value ensures strong association but potentially fewer matches. The allowed values range from 0 to 1.0. |
| GroundingDINO | tta_scales | Performs test time augmentation while predicting by changing the scale of the image. Values in the range of 0.5 to 1.5 are recommended. Multiple scale values separated by commas can also be provided, for example, 0.9, 1, 1.1. |
| GroundingDINO | nms_overlap | The maximum overlap ratio for two overlapping features, which is defined as the ratio of intersection area over union area. The default is 0.1. |
| GroundingDINO | exclude_pad_detections | If true, filters potentially truncated detections near the edges that are in the padded region of image chips. |
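
The nms_overlap and box_nms_thresh parameters in the table above both bound the intersection-over-union (IoU) ratio that two detections may share before one is discarded. The following is a minimal Python sketch of a generic greedy non-maximum suppression, assuming axis-aligned boxes given as (xmin, ymin, xmax, ymax) and one confidence score per box. The function names `iou` and `nms` are illustrative and are not taken from the tools used in the study, whose internal implementation may differ.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (xmin, ymin, xmax, ymax)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Ratio of intersection area over union area of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], max_overlap: float = 0.1) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, then drop any box whose
    IoU with an already kept box exceeds max_overlap.
    Returns the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= max_overlap for k in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

A lower max_overlap removes more near-duplicate detections, at the risk of also suppressing distinct objects that lie close together, such as stacked tyres.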

Inferences with Orthophoto “T1”. The left eight columns refer to the text prompt “Tyres”; the right eight columns refer to the text prompt “Barrels”.

| Source.Name (Tyres) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative | Source.Name (Barrels) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T1_CLIPSEG07_tyres_default_.csv | 1.00 | 0.80 | 0.89 | 1.26 | 8 | 0 | 12 | T1_CLIPSEG07_barrels_default_.csv | 0.17 | 0.44 | 0.24 | 0.38 | 4 | 20 | 5 |
| T1_CLIPSEG07_tyres_p0_bs1_ts352.csv | 0.69 | 1.00 | 0.81 | 0.69 | 11 | 5 | 0 | T1_CLIPSEG07_barrels_p22_bs1_ts352.csv | 0.13 | 1.00 | 0.24 | 0.13 | 8 | 52 | 0 |
| T1_CLIPSEG07_tyres_p0_bs4_ts352.csv | 0.69 | 1.00 | 0.81 | 0.69 | 11 | 5 | 0 | T1_CLIPSEG07_barrels_p22_bs4_ts352.csv | 0.13 | 0.89 | 0.22 | 0.14 | 7 | 49 | 1 |
| T1_CLIPSEG07_tyres_p22_bs4_ts352.csv | 0.50 | 1.00 | 0.67 | 0.50 | 8 | 8 | 0 | T1_CLIPSEG07_barrels_p0_bs1_ts352.csv | 0.10 | 1.00 | 0.17 | 0.10 | 9 | 85 | 0 |
| T1_CLIPSEG07_tyres_p22_bs1_ts352.csv | 0.44 | 1.00 | 0.62 | 0.44 | 8 | 10 | 0 | T1_CLIPSEG07_barrels_p0_bs4_ts352.csv | 0.10 | 1.00 | 0.17 | 0.10 | 9 | 85 | 0 |
| T1_TSAM08_tyres_p37_bs1_ts600_NMS_dis.csv | 0.16 | 0.68 | 0.26 | 0.24 | 26 | 137 | 19 | T1_TSAM08_barrels_p11_bs1_ts180_NMS_dis.csv | 0.08 | 0.89 | 0.14 | 0.08 | 8 | 98 | 1 |
| T1_TSAM08_tyres_p11_bs1_ts180_NMS_dis.csv | 0.15 | 0.63 | 0.25 | 0.25 | 24 | 131 | 22 | T1_TSAM08_barrels_p0_bs1_ts1020_NMS_dis.csv | 0.07 | 0.89 | 0.14 | 0.08 | 8 | 101 | 1 |
| T1_TSAM08_tyres_p0_bs1_ts1020_NMS_dis.csv | 0.15 | 0.59 | 0.24 | 0.25 | 20 | 115 | 24 | T1_TSAM08_barrels_p0_bs1_ts1480_NMS_dis.csv | 0.07 | 0.89 | 0.14 | 0.08 | 8 | 101 | 1 |
| T1_TSAM08_tyres_p0_bs1_ts1480_NMS_dis.csv | 0.15 | 0.59 | 0.24 | 0.25 | 20 | 115 | 24 | T1_TSAM08_barrels_p0_bs1_ts180_NMS_dis.csv | 0.07 | 0.89 | 0.14 | 0.08 | 8 | 101 | 1 |
| T1_TSAM08_tyres_p0_bs1_ts180_NMS_dis.csv | 0.15 | 0.59 | 0.24 | 0.25 | 20 | 115 | 24 | T1_TSAM08_barrels_p0_bs1_ts600_NMS_dis.csv | 0.07 | 0.89 | 0.14 | 0.08 | 8 | 101 | 1 |
| T1_TSAM08_tyres_p0_bs1_ts600_NMS_dis.csv | 0.15 | 0.59 | 0.24 | 0.25 | 20 | 115 | 24 | T1_TSAM08_barrels_p37_bs1_ts600_NMS_dis.csv | 0.07 | 0.78 | 0.12 | 0.08 | 7 | 100 | 2 |
| T1_TSAM08_tyres_p0_bs4_ts1020_NMS_dis.csv | 0.14 | 0.66 | 0.23 | 0.21 | 25 | 152 | 20 | T1_TSAM08_barrels_p63_bs1_ts1020_NMS_dis.csv | 0.06 | 0.89 | 0.11 | 0.07 | 8 | 127 | 1 |
| T1_TSAM08_tyres_p0_bs4_ts1480_NMS_dis.csv | 0.14 | 0.66 | 0.23 | 0.21 | 25 | 152 | 20 | T1_GDINO07_barrels_p92_bs4_ts1480_NMS_dis.csv | 0.05 | 0.89 | 0.10 | 0.06 | 8 | 138 | 1 |
| T1_TSAM08_tyres_p0_bs4_ts180_NMS_dis.csv | 0.14 | 0.66 | 0.23 | 0.21 | 25 | 152 | 20 | T1_TSAM08_barrels_p11_bs4_ts180_NMS_dis.csv | 0.05 | 0.89 | 0.10 | 0.06 | 7 | 128 | 1 |
| T1_TSAM08_tyres_p0_bs4_ts600_NMS_dis.csv | 0.14 | 0.66 | 0.23 | 0.21 | 25 | 152 | 20 | T1_TSAM08_barrels_p0_bs4_ts1020_NMS_dis.csv | 0.05 | 1.00 | 0.10 | 0.05 | 8 | 148 | 0 |
| T1_TSAM08_tyres_p63_bs1_ts1020_NMS_dis.csv | 0.14 | 0.63 | 0.23 | 0.22 | 25 | 153 | 22 | T1_TSAM08_barrels_p0_bs4_ts1480_NMS_dis.csv | 0.05 | 1.00 | 0.10 | 0.05 | 8 | 148 | 0 |
| T1_TSAM08_tyres_default_NMS_dis.csv | 0.11 | 0.78 | 0.20 | 0.14 | 29 | 230 | 13 | T1_TSAM08_barrels_p0_bs4_ts180_NMS_dis.csv | 0.05 | 1.00 | 0.10 | 0.05 | 8 | 148 | 0 |
| T1_TSAM08_tyres_p92_bs1_ts1480_NMS_dis.csv | 0.10 | 0.54 | 0.17 | 0.19 | 21 | 188 | 27 | T1_TSAM08_barrels_p0_bs4_ts600_NMS_dis.csv | 0.05 | 1.00 | 0.10 | 0.05 | 8 | 148 | 0 |
| T1_GDINO07_tyres_p0_bs1_ts1020_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs4_ts1020_NMS_dis.csv | 0.05 | 1.00 | 0.10 | 0.05 | 8 | 148 | 0 |
| T1_GDINO07_tyres_p0_bs1_ts1480_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_TSAM08_barrels_p92_bs1_ts1480_NMS_dis.csv | 0.05 | 0.78 | 0.09 | 0.06 | 7 | 137 | 2 |
| T1_GDINO07_tyres_p0_bs1_ts180_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs1_ts1020_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p0_bs1_ts600_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs1_ts1480_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p0_bs4_ts1020_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs1_ts180_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p0_bs4_ts1480_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs1_ts600_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p0_bs4_ts180_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs4_ts1480_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p0_bs4_ts600_NMS_dis.csv | 0.07 | 0.81 | 0.12 | 0.08 | 17 | 244 | 11 | T1_GDINO07_barrels_p0_bs4_ts180_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p37_bs4_ts600_NMS_dis.csv | 0.06 | 0.95 | 0.11 | 0.06 | 16 | 246 | 3 | T1_GDINO07_barrels_p0_bs4_ts600_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p63_bs4_ts1020_NMS_dis.csv | 0.06 | 0.86 | 0.11 | 0.07 | 15 | 238 | 8 | T1_GDINO07_barrels_p92_bs1_ts1480_NMS_dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 159 | 1 |
| T1_GDINO07_tyres_p63_bs1_ts1020_NMS_dis.csv | 0.06 | 0.86 | 0.11 | 0.06 | 15 | 253 | 8 | T1_TSAM08_barrels_default_NMS_dis.csv | 0.05 | 1.00 | 0.09 | 0.05 | 9 | 181 | 0 |
| T1_GDINO07_tyres_p11_bs1_ts180_NMS_dis.csv | 0.06 | 0.97 | 0.10 | 0.06 | 16 | 273 | 2 | T1_GDINO07_barrels_p11_bs1_ts180_NMS_dis.csv | 0.04 | 0.78 | 0.08 | 0.06 | 7 | 150 | 2 |
| T1_GDINO07_tyres_p11_bs4_ts180_NMS_dis.csv | 0.06 | 0.97 | 0.10 | 0.06 | 16 | 274 | 2 | T1_GDINO07_barrels_default_NMS_dis.csv | 0.04 | 0.89 | 0.08 | 0.05 | 8 | 175 | 1 |
| T1_GDINO07_tyres_p92_bs4_ts1480_NMS_dis.csv | 0.06 | 0.69 | 0.10 | 0.08 | 14 | 237 | 18 | T1_GDINO07_barrels_p11_bs4_ts180_NMS_dis.csv | 0.04 | 0.78 | 0.08 | 0.06 | 7 | 152 | 2 |
| T1_GDINO07_tyres_default_NMS_dis.csv | 0.05 | 0.86 | 0.10 | 0.06 | 12 | 212 | 8 | T1_GDINO07_barrels_p37_bs1_ts600_NMS_dis.csv | 0.04 | 0.78 | 0.08 | 0.05 | 7 | 170 | 2 |
| T1_TSAM08_tyres_p11_bs4_ts180_NMS_dis.csv | 0.05 | 0.95 | 0.09 | 0.05 | 8 | 159 | 3 | T1_GDINO07_barrels_p37_bs4_ts600_NMS_dis.csv | 0.04 | 0.67 | 0.07 | 0.05 | 6 | 160 | 3 |
| T1_GDINO07_tyres_p37_bs1_ts600_NMS_dis.csv | 0.05 | 0.69 | 0.09 | 0.07 | 13 | 258 | 18 | T1_GDINO07_barrels_p63_bs1_ts1020_NMS_dis.csv | 0.03 | 0.67 | 0.07 | 0.05 | 6 | 166 | 3 |
| T1_GDINO07_tyres_p92_bs1_ts1480_NMS_dis.csv | 0.05 | 0.68 | 0.09 | 0.07 | 13 | 269 | 19 | T1_TSAM08_barrels_p63_bs4_ts1020_NMS_dis.csv | 0.03 | 0.67 | 0.06 | 0.05 | 6 | 171 | 3 |
| T1_TSAM08_tyres_p63_bs4_ts1020_NMS_dis.csv | 0.03 | 0.71 | 0.06 | 0.05 | 7 | 204 | 17 | T1_TSAM08_barrels_p92_bs4_ts1480_NMS_dis.csv | 0.03 | 0.56 | 0.05 | 0.05 | 5 | 186 | 4 |
| T1_TSAM08_tyres_p37_bs4_ts600_NMS_dis.csv | 0.03 | 0.86 | 0.06 | 0.04 | 6 | 177 | 8 | T1_GDINO07_barrels_p63_bs4_ts1020_NMS_dis.csv | 0.03 | 0.44 | 0.05 | 0.06 | 4 | 155 | 5 |
| T1_TSAM08_tyres_p92_bs4_ts1480_NMS_dis.csv | 0.03 | 0.61 | 0.06 | 0.05 | 7 | 216 | 23 | T1_TSAM08_barrels_p37_bs4_ts600_NMS_dis.csv | 0.02 | 0.33 | 0.04 | 0.06 | 3 | 137 | 6 |
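
The Precision, Recall, and F1_Score columns can be related to the True_Positive, False_Positive, and False_Negative counts through the standard definitions. The sketch below is a minimal Python illustration under that assumption. The function name `detection_metrics` is hypothetical, and the article's own matching and counting procedure (Section 2.5) may differ, so not every row is expected to be reproduced by these formulas.

```python
from typing import Tuple


def detection_metrics(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F1        = harmonic mean of precision and recall
    A ratio with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Counts from the row T1_CLIPSEG07_tyres_p0_bs1_ts352.csv (TP=11, FP=5, FN=0)
    p, r, f1 = detection_metrics(11, 5, 0)
    print(round(p, 2), round(r, 2), round(f1, 2))  # 0.69 1.0 0.81
```

For that row the computed values agree with the tabulated 0.69, 1.00, and 0.81.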

Inferences with Orthophoto “T2”. The left eight columns refer to the text prompt “Tyres”; the right eight columns refer to the text prompt “Barrels”.

| Source.Name (Tyres) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative | Source.Name (Barrels) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T2_CLIPSEG07_tyres_default_.csv | 0.93 | 0.68 | 0.79 | 1.36 | 13 | 1 | 19 | T2_CLIPSEG07_barrels_default_.csv | 0.18 | 0.67 | 0.29 | 0.27 | 6 | 27 | 3 |
| T2_CLIPSEG07_tyres_p0_bs1_ts352.csv | 0.56 | 0.90 | 0.69 | 0.62 | 15 | 12 | 6 | T2_CLIPSEG07_barrels_p22_bs4_ts352.csv | 0.12 | 1.00 | 0.22 | 0.12 | 9 | 65 | 0 |
| T2_CLIPSEG07_tyres_p0_bs4_ts352.csv | 0.56 | 0.90 | 0.69 | 0.62 | 15 | 12 | 6 | T2_CLIPSEG07_barrels_p22_bs1_ts352.csv | 0.12 | 1.00 | 0.21 | 0.12 | 9 | 67 | 0 |
| T2_CLIPSEG07_tyres_p22_bs1_ts352.csv | 0.55 | 0.85 | 0.67 | 0.65 | 11 | 9 | 9 | T2_CLIPSEG07_barrels_p0_bs1_ts352.csv | 0.08 | 0.89 | 0.15 | 0.09 | 8 | 88 | 1 |
| T2_CLIPSEG07_tyres_p22_bs4_ts352.csv | 0.55 | 0.85 | 0.67 | 0.65 | 11 | 9 | 9 | T2_CLIPSEG07_barrels_p0_bs4_ts352.csv | 0.08 | 0.89 | 0.15 | 0.09 | 8 | 88 | 1 |
| T2_TSAM08_tyres_p11_bs1_ts180_NMS_Dis.csv | 0.19 | 0.73 | 0.30 | 0.26 | 29 | 122 | 16 | T2_TSAM08_barrels_p0_bs1_ts1020_NMS_Dis.csv | 0.07 | 0.89 | 0.13 | 0.08 | 8 | 104 | 1 |
| T2_TSAM08_tyres_p63_bs1_ts1020_NMS_Dis.csv | 0.19 | 0.77 | 0.30 | 0.25 | 29 | 125 | 14 | T2_TSAM08_barrels_p0_bs1_ts1480_NMS_Dis.csv | 0.07 | 0.89 | 0.13 | 0.08 | 8 | 104 | 1 |
| T2_GDINO07_tyres_p63_bs4_ts1020_NMS_Dis.csv | 0.17 | 0.97 | 0.28 | 0.17 | 28 | 141 | 2 | T2_TSAM08_barrels_p0_bs1_ts180_NMS_Dis.csv | 0.07 | 0.89 | 0.13 | 0.08 | 8 | 104 | 1 |
| T2_TSAM08_tyres_p37_bs1_ts600_NMS_Dis.csv | 0.17 | 0.73 | 0.28 | 0.24 | 31 | 147 | 16 | T2_TSAM08_barrels_p0_bs1_ts600_NMS_Dis.csv | 0.07 | 0.89 | 0.13 | 0.08 | 8 | 104 | 1 |
| T2_GDINO07_tyres_p63_bs1_ts1020_NMS_Dis.csv | 0.16 | 0.98 | 0.28 | 0.16 | 27 | 140 | 1 | T2_GDINO07_barrels_p92_bs4_ts1480_NMS_Dis.csv | 0.07 | 1.00 | 0.13 | 0.07 | 9 | 120 | 0 |
| T2_GDINO07_tyres_p37_bs4_ts600_NMS_Dis.csv | 0.16 | 0.87 | 0.27 | 0.18 | 29 | 152 | 8 | T2_GDINO07_barrels_p0_bs1_ts1020_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs4_ts1020_NMS_Dis.csv | 0.16 | 0.75 | 0.27 | 0.22 | 28 | 143 | 15 | T2_GDINO07_barrels_p0_bs1_ts1480_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs4_ts1480_NMS_Dis.csv | 0.16 | 0.75 | 0.27 | 0.22 | 28 | 143 | 15 | T2_GDINO07_barrels_p0_bs1_ts180_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs4_ts180_NMS_Dis.csv | 0.16 | 0.75 | 0.27 | 0.22 | 28 | 143 | 15 | T2_GDINO07_barrels_p0_bs1_ts600_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs4_ts600_NMS_Dis.csv | 0.16 | 0.75 | 0.27 | 0.22 | 28 | 143 | 15 | T2_GDINO07_barrels_p0_bs4_ts1480_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs1_ts1020_NMS_Dis.csv | 0.17 | 0.52 | 0.26 | 0.33 | 21 | 101 | 29 | T2_GDINO07_barrels_p0_bs4_ts180_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs1_ts1480_NMS_Dis.csv | 0.17 | 0.52 | 0.26 | 0.33 | 21 | 101 | 29 | T2_GDINO07_barrels_p0_bs4_ts600_NMS_Dis.csv | 0.06 | 0.89 | 0.12 | 0.07 | 8 | 120 | 1 |
| T2_TSAM08_tyres_p0_bs1_ts180_NMS_Dis.csv | 0.17 | 0.52 | 0.26 | 0.33 | 21 | 101 | 29 | T2_GDINO07_barrels_p37_bs1_ts600_NMS_Dis.csv | 0.06 | 0.78 | 0.12 | 0.08 | 7 | 104 | 2 |
| T2_TSAM08_tyres_p0_bs1_ts600_NMS_Dis.csv | 0.17 | 0.52 | 0.26 | 0.33 | 21 | 101 | 29 | T2_TSAM08_barrels_p0_bs4_ts1020_NMS_Dis.csv | 0.06 | 1.00 | 0.11 | 0.06 | 9 | 143 | 0 |
| T2_GDINO07_tyres_p0_bs1_ts1020_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_TSAM08_barrels_p0_bs4_ts1480_NMS_Dis.csv | 0.06 | 1.00 | 0.11 | 0.06 | 9 | 143 | 0 |
| T2_GDINO07_tyres_p0_bs1_ts1480_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_TSAM08_barrels_p0_bs4_ts180_NMS_Dis.csv | 0.06 | 1.00 | 0.11 | 0.06 | 9 | 143 | 0 |
| T2_GDINO07_tyres_p0_bs1_ts180_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_TSAM08_barrels_p0_bs4_ts600_NMS_Dis.csv | 0.06 | 1.00 | 0.11 | 0.06 | 9 | 143 | 0 |
| T2_GDINO07_tyres_p0_bs1_ts600_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_GDINO07_barrels_p0_bs4_ts1020_NMS_Dis.csv | 0.06 | 1.00 | 0.11 | 0.06 | 9 | 143 | 0 |
| T2_GDINO07_tyres_p0_bs4_ts1020_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_TSAM08_barrels_p11_bs1_ts180_NMS_Dis.csv | 0.06 | 0.78 | 0.11 | 0.08 | 7 | 112 | 2 |
| T2_GDINO07_tyres_p0_bs4_ts1480_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_GDINO07_barrels_p92_bs1_ts1480_NMS_Dis.csv | 0.06 | 0.78 | 0.11 | 0.08 | 7 | 113 | 2 |
| T2_GDINO07_tyres_p0_bs4_ts180_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_GDINO07_barrels_p63_bs4_ts1020_NMS_Dis.csv | 0.05 | 0.89 | 0.10 | 0.06 | 8 | 139 | 1 |
| T2_GDINO07_tyres_p0_bs4_ts600_NMS_Dis.csv | 0.15 | 0.88 | 0.25 | 0.17 | 24 | 140 | 7 | T2_TSAM08_barrels_p37_bs1_ts600_NMS_Dis.csv | 0.06 | 0.67 | 0.10 | 0.08 | 6 | 102 | 3 |
| T2_TSAM08_tyres_p92_bs1_ts1480_NMS_Dis.csv | 0.14 | 0.80 | 0.25 | 0.18 | 30 | 177 | 12 | T2_GDINO07_barrels_p37_bs4_ts600_NMS_Dis.csv | 0.06 | 0.67 | 0.10 | 0.08 | 6 | 103 | 3 |
| T2_GDINO07_tyres_p11_bs4_ts180_NMS_Dis.csv | 0.14 | 0.87 | 0.25 | 0.16 | 28 | 168 | 8 | T2_GDINO07_barrels_p11_bs4_ts180_NMS_Dis.csv | 0.05 | 0.78 | 0.10 | 0.07 | 7 | 125 | 2 |
| T2_GDINO07_tyres_p11_bs1_ts180_NMS_Dis.csv | 0.14 | 0.87 | 0.24 | 0.16 | 28 | 169 | 8 | T2_GDINO07_barrels_p63_bs1_ts1020_NMS_Dis.csv | 0.05 | 0.78 | 0.09 | 0.06 | 7 | 133 | 2 |
| T2_GDINO07_tyres_p92_bs4_ts1480_NMS_Dis.csv | 0.14 | 0.77 | 0.23 | 0.18 | 25 | 156 | 14 | T2_TSAM08_barrels_default_NMS_Dis.csv | 0.05 | 1.00 | 0.09 | 0.05 | 9 | 178 | 0 |
| T2_GDINO07_tyres_default_NMS_Dis.csv | 0.12 | 1.00 | 0.22 | 0.12 | 31 | 226 | 0 | T2_TSAM08_barrels_p63_bs1_ts1020_NMS_Dis.csv | 0.05 | 0.78 | 0.09 | 0.06 | 7 | 137 | 2 |
| T2_TSAM08_tyres_default_NMS_Dis.csv | 0.12 | 0.82 | 0.21 | 0.15 | 31 | 227 | 11 | T2_GDINO07_barrels_p11_bs1_ts180_NMS_Dis.csv | 0.05 | 0.67 | 0.09 | 0.07 | 6 | 117 | 3 |
| T2_GDINO07_tyres_p37_bs1_ts600_NMS_Dis.csv | 0.11 | 0.57 | 0.18 | 0.19 | 20 | 161 | 26 | T2_TSAM08_barrels_p11_bs4_ts180_NMS_Dis.csv | 0.05 | 0.78 | 0.09 | 0.06 | 7 | 140 | 2 |
| T2_TSAM08_tyres_p11_bs4_ts180_NMS_Dis.csv | 0.09 | 0.83 | 0.17 | 0.11 | 16 | 156 | 10 | T2_GDINO07_barrels_default_NMS_Dis.csv | 0.05 | 0.89 | 0.09 | 0.05 | 8 | 166 | 1 |
| T2_TSAM08_tyres_p37_bs4_ts600_NMS_Dis.csv | 0.08 | 0.72 | 0.15 | 0.12 | 16 | 174 | 17 | T2_TSAM08_barrels_p92_bs1_ts1480_NMS_Dis.csv | 0.04 | 0.67 | 0.07 | 0.06 | 6 | 148 | 3 |
| T2_GDINO07_tyres_p92_bs1_ts1480_NMS_Dis.csv | 0.07 | 0.35 | 0.11 | 0.19 | 12 | 168 | 39 | T2_TSAM08_barrels_p63_bs4_ts1020_NMS_Dis.csv | 0.04 | 0.67 | 0.07 | 0.06 | 6 | 157 | 3 |
| T2_TSAM08_tyres_p92_bs4_ts1480_NMS_Dis.csv | 0.06 | 0.47 | 0.11 | 0.13 | 13 | 196 | 32 | T2_TSAM08_barrels_p92_bs4_ts1480_NMS_Dis.csv | 0.03 | 0.56 | 0.05 | 0.05 | 5 | 168 | 4 |
| T2_TSAM08_tyres_p63_bs4_ts1020_NMS_Dis.csv | 0.04 | 0.45 | 0.07 | 0.09 | 7 | 170 | 33 | T2_TSAM08_barrels_p37_bs4_ts600_NMS_Dis.csv | 0.01 | 0.22 | 0.03 | 0.07 | 2 | 136 | 7 |
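
The Source.Name values appear to encode the run configuration: orthophoto, model tag, text prompt, and either the word "default" or the padding (p), batch_size (bs), and tile_size (ts) values. The following Python sketch parses such names under that assumption and checks the documented constraint that padding can be at most half of the tile size. The naming scheme is inferred from the tables rather than stated by the authors, and `RunConfig`, `parse_run_name`, and `padding_within_limit` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class RunConfig(NamedTuple):
    orthophoto: str
    model: str                 # model tag as written in the file name, e.g. "TSAM08"
    prompt: str
    padding: Optional[int]     # None when the run used default parameters
    batch_size: Optional[int]
    tile_size: Optional[int]


# Assumed pattern, inferred from names such as
# "T2_TSAM08_tyres_p11_bs1_ts180_NMS_Dis.csv" and "T2_CLIPSEG07_tyres_default_.csv"
_PATTERN = re.compile(
    r"^(?P<ortho>T\d+)_(?P<model>[A-Z]+\d+)_(?P<prompt>[a-z]+)_"
    r"(?:default|p(?P<pad>\d+)_bs(?P<bs>\d+)_ts(?P<ts>\d+))"
    r"(?:_.*)?\.csv$"
)


def _to_int(text: Optional[str]) -> Optional[int]:
    return int(text) if text is not None else None


def parse_run_name(name: str) -> RunConfig:
    """Split a result file name into its configuration fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognised run name: {name!r}")
    return RunConfig(
        orthophoto=match["ortho"],
        model=match["model"],
        prompt=match["prompt"],
        padding=_to_int(match["pad"]),
        batch_size=_to_int(match["bs"]),
        tile_size=_to_int(match["ts"]),
    )


def padding_within_limit(cfg: RunConfig) -> bool:
    """True when padding does not exceed half of the tile size
    (runs with default parameters are not checked)."""
    if cfg.padding is None or cfg.tile_size is None:
        return True
    return 2 * cfg.padding <= cfg.tile_size


if __name__ == "__main__":
    cfg = parse_run_name("T2_TSAM08_tyres_p11_bs1_ts180_NMS_Dis.csv")
    print(cfg, padding_within_limit(cfg))
```

Parsing the names into fields makes it straightforward to group rows by padding, batch_size, or tile_size when comparing configurations.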

Inferences with Orthophoto “T3”. The left eight columns refer to the text prompt “Tyres”; the right eight columns refer to the text prompt “Barrels”.

| Source.Name (Tyres) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative | Source.Name (Barrels) | Precision | Recall | F1_Score | AP | True_Positive | False_Positive | False_Negative |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T3_CLIPSEG07_tyres_p22_bs1_ts352.csv | 1.00 | 0.80 | 0.89 | 1.25 | 15 | 0 | 5 | T3_GDINO07_barrels_p21_bs1_ts340_NMS_Dis.csv | 0.23 | 0.75 | 0.35 | 0.31 | 3 | 10 | 1 |
| T3_CLIPSEG07_tyres_p22_bs4_ts352.csv | 1.00 | 0.80 | 0.89 | 1.25 | 15 | 0 | 5 | T3_GDINO07_barrels_p12_bs1_ts200_NMS_Dis.csv | 0.21 | 0.75 | 0.33 | 0.29 | 3 | 11 | 1 |
| T3_CLIPSEG07_tyres_p0_bs1_ts352.csv | 0.79 | 0.84 | 0.81 | 0.94 | 15 | 4 | 4 | T3_TSAM08_barrels_p0_bs1_ts200_NMS_Dis.csv | 0.20 | 0.75 | 0.32 | 0.27 | 3 | 12 | 1 |
| T3_CLIPSEG07_tyres_p0_bs4_ts352.csv | 0.79 | 0.84 | 0.81 | 0.94 | 15 | 4 | 4 | T3_TSAM08_barrels_p0_bs1_ts340_NMS_Dis.csv | 0.20 | 0.75 | 0.32 | 0.27 | 3 | 12 | 1 |
| T3_CLIPSEG07_tyres_default_.csv | 0.70 | 0.76 | 0.73 | 0.92 | 14 | 6 | 6 | T3_TSAM08_barrels_p0_bs1_ts495_NMS_Dis.csv | 0.20 | 0.75 | 0.32 | 0.27 | 3 | 12 | 1 |
| T3_GDINO07_tyres_p0_bs1_ts200_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_TSAM08_barrels_p0_bs1_ts60_NMS_Dis.csv | 0.20 | 0.75 | 0.32 | 0.27 | 3 | 12 | 1 |
| T3_GDINO07_tyres_p0_bs1_ts340_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p21_bs4_ts340_NMS_Dis.csv | 0.19 | 0.75 | 0.30 | 0.25 | 3 | 13 | 1 |
| T3_GDINO07_tyres_p0_bs1_ts495_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs1_ts200_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_GDINO07_tyres_p0_bs1_ts60_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs1_ts340_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_GDINO07_tyres_p0_bs4_ts200_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs1_ts495_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_GDINO07_tyres_p0_bs4_ts340_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs1_ts60_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_GDINO07_tyres_p0_bs4_ts495_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs4_ts200_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_GDINO07_tyres_p0_bs4_ts60_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.65 | 19 | 13 | 2 | T3_GDINO07_barrels_p0_bs4_ts495_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_TSAM08_tyres_p0_bs1_ts200_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.64 | 23 | 16 | 2 | T3_GDINO07_barrels_p0_bs4_ts60_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 2 | 9 | 1 |
| T3_TSAM08_tyres_p0_bs1_ts340_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.64 | 23 | 16 | 2 | T3_GDINO07_barrels_p12_bs4_ts200_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 3 | 14 | 1 |
| T3_TSAM08_tyres_p0_bs1_ts495_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.64 | 23 | 16 | 2 | T3_GDINO07_barrels_p3_bs1_ts60_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 3 | 14 | 1 |
| T3_TSAM08_tyres_p0_bs1_ts60_NMS_Dis.csv | 0.59 | 0.92 | 0.72 | 0.64 | 23 | 16 | 2 | T3_GDINO07_barrels_p3_bs4_ts60_NMS_Dis.csv | 0.18 | 0.75 | 0.29 | 0.24 | 3 | 14 | 1 |
| T3_GDINO07_tyres_p21_bs1_ts340_NMS_Dis.csv | 0.58 | 0.92 | 0.71 | 0.63 | 21 | 15 | 2 | T3_TSAM08_barrels_p3_bs1_ts60_NMS_Dis.csv | 0.17 | 0.75 | 0.27 | 0.22 | 3 | 15 | 1 |
| T3_GDINO07_tyres_p21_bs4_ts340_NMS_Dis.csv | 0.55 | 0.92 | 0.69 | 0.60 | 21 | 17 | 2 | T3_TSAM08_barrels_p0_bs4_ts200_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_GDINO07_tyres_p12_bs4_ts200_NMS_Dis.csv | 0.55 | 0.92 | 0.69 | 0.60 | 17 | 14 | 2 | T3_TSAM08_barrels_p0_bs4_ts340_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_TSAM08_tyres_p0_bs4_ts200_NMS_Dis.csv | 0.53 | 0.92 | 0.68 | 0.58 | 23 | 20 | 2 | T3_TSAM08_barrels_p0_bs4_ts495_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_TSAM08_tyres_p0_bs4_ts340_NMS_Dis.csv | 0.53 | 0.92 | 0.68 | 0.58 | 23 | 20 | 2 | T3_TSAM08_barrels_p0_bs4_ts60_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_TSAM08_tyres_p0_bs4_ts495_NMS_Dis.csv | 0.53 | 0.92 | 0.68 | 0.58 | 23 | 20 | 2 | T3_TSAM08_barrels_p30_bs1_ts495_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_TSAM08_tyres_p0_bs4_ts60_NMS_Dis.csv | 0.53 | 0.92 | 0.68 | 0.58 | 23 | 20 | 2 | T3_GDINO07_barrels_p0_bs4_ts340_NMS_Dis.csv | 0.16 | 0.75 | 0.26 | 0.21 | 3 | 16 | 1 |
| T3_GDINO07_tyres_p12_bs1_ts200_NMS_Dis.csv | 0.53 | 0.92 | 0.67 | 0.58 | 17 | 15 | 2 | T3_TSAM08_barrels_p12_bs1_ts200_NMS_Dis.csv | 0.14 | 0.75 | 0.23 | 0.18 | 3 | 19 | 1 |
| T3_GDINO07_tyres_default_NMS_Dis.csv | 0.50 | 0.92 | 0.65 | 0.54 | 13 | 13 | 2 | T3_TSAM08_barrels_p3_bs4_ts60_NMS_Dis.csv | 0.13 | 0.75 | 0.22 | 0.17 | 3 | 20 | 1 |
| T3_GDINO07_tyres_p3_bs1_ts60_NMS_Dis.csv | 0.50 | 0.92 | 0.65 | 0.54 | 19 | 19 | 2 | T3_GDINO07_barrels_p30_bs1_ts495_NMS_Dis.csv | 0.13 | 0.75 | 0.21 | 0.17 | 2 | 14 | 1 |
| T3_GDINO07_tyres_p3_bs4_ts60_NMS_Dis.csv | 0.50 | 0.92 | 0.65 | 0.54 | 19 | 19 | 2 | T3_GDINO07_barrels_p30_bs4_ts495_NMS_Dis.csv | 0.13 | 0.75 | 0.21 | 0.17 | 2 | 14 | 1 |
| T3_TSAM08_tyres_p3_bs4_ts60_NMS_Dis.csv | 0.49 | 0.92 | 0.64 | 0.53 | 22 | 23 | 2 | T3_TSAM08_barrels_p21_bs1_ts340_NMS_Dis.csv | 0.12 | 0.75 | 0.21 | 0.16 | 3 | 22 | 1 |
| T3_GDINO07_tyres_p30_bs1_ts495_NMS_Dis.csv | 0.50 | 0.88 | 0.64 | 0.57 | 19 | 19 | 3 | T3_GDINO07_barrels_default_NMS_Dis.csv | 0.11 | 0.75 | 0.18 | 0.14 | 2 | 17 | 1 |
| T3_GDINO07_tyres_p30_bs4_ts495_NMS_Dis.csv | 0.50 | 0.88 | 0.64 | 0.57 | 19 | 19 | 3 | T3_TSAM08_barrels_default_NMS_Dis.csv | 0.09 | 0.75 | 0.17 | 0.13 | 3 | 29 | 1 |
| T3_TSAM08_tyres_default_NMS_Dis.csv | 0.48 | 0.92 | 0.63 | 0.52 | 21 | 23 | 2 | T3_CLIPSEG07_barrels_p0_bs1_ts352.csv | 0.10 | 0.50 | 0.16 | 0.19 | 2 | 19 | 2 |
| T3_TSAM08_tyres_p12_bs1_ts200_NMS_Dis.csv | 0.47 | 0.92 | 0.62 | 0.51 | 22 | 25 | 2 | T3_CLIPSEG07_barrels_p0_bs4_ts352.csv | 0.10 | 0.50 | 0.16 | 0.19 | 2 | 19 | 2 |
| T3_TSAM08_tyres_p3_bs1_ts60_NMS_Dis.csv | 0.47 | 0.92 | 0.62 | 0.51 | 22 | 25 | 2 | T3_TSAM08_barrels_p21_bs4_ts340_NMS_Dis.csv | 0.09 | 0.50 | 0.15 | 0.18 | 2 | 20 | 2 |
| T3_TSAM08_tyres_p21_bs1_ts340_NMS_Dis.csv | 0.45 | 0.92 | 0.61 | 0.49 | 23 | 28 | 2 | T3_CLIPSEG07_barrels_p22_bs1_ts352.csv | 0.07 | 0.50 | 0.13 | 0.15 | 2 | 25 | 2 |
| T3_TSAM08_tyres_p30_bs1_ts495_NMS_Dis.csv | 0.46 | 0.88 | 0.60 | 0.52 | 22 | 26 | 3 | T3_CLIPSEG07_barrels_p22_bs4_ts352.csv | 0.07 | 0.50 | 0.13 | 0.15 | 2 | 25 | 2 |
| T3_TSAM08_tyres_p12_bs4_ts200_NMS_Dis.csv | 0.18 | 0.44 | 0.25 | 0.40 | 7 | 33 | 14 | T3_TSAM08_barrels_p30_bs4_ts495_NMS_Dis.csv | 0.07 | 0.75 | 0.13 | 0.09 | 2 | 27 | 1 |
| T3_TSAM08_tyres_p21_bs4_ts340_NMS_Dis.csv | 0.19 | 0.28 | 0.23 | 0.68 | 7 | 30 | 18 | T3_TSAM08_barrels_p12_bs4_ts200_NMS_Dis.csv | 0.03 | 0.25 | 0.06 | 0.14 | 1 | 28 | 3 |
| T3_TSAM08_tyres_p30_bs4_ts495_NMS_Dis.csv | 0.14 | 0.20 | 0.17 | 0.71 | 5 | 30 | 20 | T3_CLIPSEG07_barrels_default_.csv | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 22 | 4 |
References
1. Ichinose, D.; Yamamoto, M. On the Relationship between the Provision of Waste Management Service and Illegal Dumping. Resour. Energy Econ. 2011, 33, 79–93.
2. Porta, D.; Milani, S.; Lazzarino, A.I.; Perucci, C.A.; Forastiere, F. Systematic Review of Epidemiological Studies on Health Effects Associated with Management of Solid Waste. Environ. Health 2009, 8, 60.
3. Kjeldsen, P.; Barlaz, M.A.; Rooker, A.P.; Baun, A.; Ledin, A.; Christensen, T.H. Present and Long-Term Composition of MSW Landfill Leachate: A Review. Crit. Rev. Environ. Sci. Technol. 2002, 32, 297–336.
4. Bansal, K.; Tripathi, A.K. WasteNet: A Novel Multi-Scale Attention-Based U-Net Architecture for Waste Detection in UAV Images. Remote Sens. Appl. Soc. Environ. 2024, 35, 101220.
5. Berra, E.F.; Peppa, M.V. Advances and Challenges of UAV SFM MVS Photogrammetry and Remote Sensing: Short Review. In Proceedings of the 2020 IEEE Latin American GRSS & ISPRS Remote Sensing Conference (LAGIRS), Santiago, Chile, 22–26 March 2020; pp. 533–538.
6. Diara, F.; Roggero, M. Quality Assessment of DJI Zenmuse L1 and P1 LiDAR and Photogrammetric Systems: Metric and Statistics Analysis with the Integration of Trimble SX10 Data. Geomatics 2022, 2, 254–281.
7. Colomina, I.; Molina, P. Unmanned Aerial Systems for Photogrammetry and Remote Sensing: A Review. ISPRS J. Photogramm. Remote Sens. 2014, 92, 79–97.
8. Mittal, P.; Singh, R.; Sharma, A. Deep Learning-Based Object Detection in Low-Altitude UAV Datasets: A Survey. Image Vis. Comput. 2020, 104, 104046.
9. Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J.; et al. MMDetection: Open MMLab Detection Toolbox and Benchmark. arXiv 2019, arXiv:1906.07155.
10. Yaseen, M. What Is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector. arXiv 2024, arXiv:2409.07813.
11. Caron, M.; Touvron, H.; Misra, I.; Jégou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021.
12. Sun, C.; Shrivastava, A.; Singh, S.; Gupta, A. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
13. Eitel, A.; Springenberg, J.T.; Spinello, L.; Riedmiller, M.; Burgard, W. Multimodal Deep Learning for Robust RGB-D Object Recognition. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 681–687.
14. Chen, K.; Jiang, X.; Wang, H.; Yan, C.; Gao, Y.; Tang, X.; Hu, Y.; Xie, W. OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition. Int. J. Comput. Vis. 2024, 132, 5387–5409.
15. Li, S.; Cao, J.; Ye, P.; Ding, Y.; Tu, C.; Chen, T. ClipSAM: CLIP and SAM Collaboration for Zero-Shot Anomaly Segmentation. Neurocomputing 2025, 618, 129122.
16. Ren, T.; Jiang, Q.; Liu, S.; Zeng, Z.; Liu, W.; Gao, H.; Huang, H.; Ma, Z.; Jiang, X.; Chen, Y.; et al. Grounding DINO 1.5: Advance the “Edge” of Open-Set Object Detection. arXiv 2024, arXiv:2405.10300.
17. Lüddecke, T.; Ecker, A.S. Image Segmentation Using Text and Image Prompts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022.
18. Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models from Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual Event, 8–24 July 2021.
19. Liu, S.; Zeng, Z.; Ren, T.; Li, F.; Zhang, H.; Yang, J.; Jiang, Q.; Li, C.; Yang, J.; Su, H.; et al. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection. In Proceedings of the Computer Vision—ECCV 2024, Milan, Italy, 29 September–4 October 2024.
20. Lin, T.-Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L.; Dollár, P. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision—ECCV 2014, Zurich, Switzerland, 6–12 September 2014.
21. Mumuni, F.; Mumuni, A. Segment Anything Model for Automated Image Data Annotation: Empirical Studies Using Text Prompts from Grounding DINO. arXiv 2024, arXiv:2406.19057.
22. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4015–4026.
23. Spiegler, P.; Koleilat, T.; Harirpoush, A.; Miller, C.S.; Rivaz, H.; Kersten-Oertel, M.; Xiao, Y. TextSAM-EUS: Text Prompt Learning for SAM to Accurately Segment Pancreatic Tumor in Endoscopic Ultrasound. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Honolulu, HI, USA, 19–23 October 2025.
24. Goodwin, M.; Halvorsen, K.T.; Jiao, L.; Knausgård, K.M.; Martin, A.H.; Moyano, M.; Oomen, R.A.; Rasmussen, J.H.; Sørdalen, T.K.; Thorbjørnsen, S.H. Unlocking the Potential of Deep Learning for Marine Ecology: Overview, Applications, and Outlook. ICES J. Mar. Sci. 2022, 79, 319–336.













Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Demartis, A.; Giulio Tonolo, F.; Barchi, F.; Zanella, S.; Acquaviva, A. Analytical Assessment of Pre-Trained Prompt-Based Multimodal Deep Learning Models for UAV-Based Object Detection Supporting Environmental Crimes Monitoring. Geomatics 2026, 6, 14. https://doi.org/10.3390/geomatics6010014

