Evaluation Framework for Bruise Detection: Systematic ALS/White-Light Training and Skin-Tone Balancing with Deep Learning
Abstract
1. Introduction
1.1. Emerging Technologies in Bruise Detection
1.2. Object Detection Models in Bruise Analysis
1.3. Study Goal and Contributions
- (1) Dual-modality detection evaluation: A bruise detection and documentation model capable of handling both ambient white-light and specialized narrow-band ALS forensic images with comparable levels of performance.
- (2) Illumination-composition diagnostics: Systematic W/ALS training-ratio sweeps quantify the influence of ALS exposure during training on detection sensitivity, precision, and localization stability. The analysis identifies regimes in which ALS-dominant training materially improves AP and regimes in which white-light inclusion degrades performance, thereby informing acquisition priorities for deployment.
- (3) Subgroup fairness and localization trade-off analysis: Fairness for detection tasks is operationalized by measuring failure-rate reduction, localization variance, and overprediction frequency across skin-tone strata under targeted balancing strategies. This approach exposes trade-offs between improved sensitivity for darker skin tones and increased spatial uncertainty, informing dataset curation and annotation guidelines.
- (4) Partitioning diagnostics via embedding-similarity and seen/unseen evaluation: Backbone-derived image embeddings and Euclidean-similarity quantiles are used to quantify the extent to which image-level partitioning inflates apparent generalization. Stratified analyses (seen → unseen and top → bottom similarity) with bootstrapped confidence intervals demonstrate substantive mAP declines on less-similar or unseen injuries.
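
The partitioning diagnostic in contribution (4) can be illustrated with a minimal sketch. It assumes that each image already has an embedding vector and a per-image score; the toy vectors, the scores, the 0.5 quantile cut, and the helper names `split_by_similarity` and `bootstrap_ci` are all hypothetical. A faithful mAP analysis would resample images and recompute mAP on each resample, whereas this sketch only bootstraps the mean of per-image scores.

```python
import math
import random
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_by_similarity(test_vecs, train_vecs, quantile=0.5):
    """Rank test images by distance to their nearest training embedding.

    Returns (most_similar, least_similar) lists of test indices, cut at the
    given quantile of the ranked distances.
    """
    dists = [min(euclidean(v, t) for t in train_vecs) for v in test_vecs]
    order = sorted(range(len(dists)), key=dists.__getitem__)
    cut = round(quantile * len(order))
    return order[:cut], order[cut:]


def bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-image scores."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(scores, k=len(scores))) for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy 2-D "embeddings" and per-image scores (hypothetical values).
train_vecs = [[0.0, 0.0], [1.0, 0.0]]
test_vecs = [[0.1, 0.0], [5.0, 5.0], [0.9, 0.1], [4.0, 4.0]]
scores = [0.9, 0.2, 0.8, 0.3]

top, bottom = split_by_similarity(test_vecs, train_vecs)
top_ci = bootstrap_ci([scores[i] for i in top])
bottom_ci = bootstrap_ci([scores[i] for i in bottom])
print("most similar:", top, top_ci)
print("least similar:", bottom, bottom_ci)
```

A gap between the two intervals is the kind of signal that would suggest image-level partitioning overstates generalization to dissimilar injuries.
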
2. Related Works
2.1. AI in Medical Imaging
2.2. Computer Vision in Other Forensics Disciplines
2.3. Other Computer Vision AI Applications
3. Methodology
3.1. Dataset and Preprocessing
3.1.1. Dataset Source and Description
3.1.2. Class Distribution and Bias Analysis
3.1.3. Data Augmentation
3.1.4. Data Partitioning
3.2. Object Detection Models
3.2.1. Rationale for Model Selection
3.2.2. Faster R-CNN
3.2.3. RetinaNet
3.2.4. FCOS
3.3. Experimental Design and Training Protocol
3.3.1. Experimental Objectives and Design
3.3.2. Training Implementation and Hyperparameters
3.3.3. Cross-Condition Generalization Strategies for Training
3.4. Evaluation Protocol
3.4.1. Core Performance Metrics
3.4.2. Detection (Confidence) Thresholding and IoU Variability Analysis
3.4.3. Image-Specific and Failure Analysis
3.4.4. Analysis of Performance Dependency on Data Partitioning
4. Results and Discussion
4.1. Analysis of Object Detection Models
4.1.1. Performance Evaluation of Faster R-CNN Variants
4.1.2. Analysis of Detection Performance and Reliability Across Models
4.2. Model Performance and Cross-Dataset Generalization
4.3. Influence of W/ALS Ratio on Model Training Performance
4.4. Effect of Confidence and IoU Threshold Variations
4.5. Impact of Dataset Balancing on Model Performance
4.5.1. Overall Model Performance
4.5.2. Skin Tone Fairness
4.6. Error Analysis and Performance Influences
4.6.1. Influence of Dataset Attributes on Detection Outcomes
4.6.2. Failure Cases and Model Limitations
4.7. Impact of Data Partitioning on Generalization
5. Limitations and Future Works
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Alias | Description | Total Samples | Train Samples | Validation/Test Samples (each) |
|---|---|---|---|---|
| W | Light: white; filter: none | 1046 | 732 | 157 |
| ALS | Light: 415 nm & 450 nm; filters: orange (O) and yellow (Y) | 5406 | 3784 | 811 |
| ALS+W | - | 6452 | 4516 | 968 |

| Dataset Name | Metric | R | G | B |
|---|---|---|---|---|
| W | Mean | 0.505 | 0.356 | 0.300 |
| | Std | 0.315 | 0.230 | 0.199 |
| ALS (415 nm & 450 nm) | Mean | 0.393 | 0.314 | 0.017 |
| | Std | 0.256 | 0.273 | 0.021 |
| ALS with orange filter | Mean | 0.453 | 0.227 | 0.006 |
| | Std | 0.294 | 0.185 | 0.004 |
| ALS with yellow filter | Mean | 0.332 | 0.404 | 0.029 |
| | Std | 0.191 | 0.317 | 0.025 |
| Combined (W + ALS, 415 nm & 450 nm) | Mean | 0.411 | 0.321 | 0.063 |
| | Std | 0.269 | 0.267 | 0.133 |

| Operation | Parameters |
|---|---|
| Resize | (224, 224) |
| Flip | Horizontal: p = 0.5; vertical: p = 0.5 |
| Affine | Random crop: p = 0.5, ratio = 0.9; rotate: p = 0.5 |
| Noise | Gaussian: p = 0.5, mean = 10, std = 50 |
| Normalize | mean = [0.485, 0.456, 0.406]; std = [0.229, 0.224, 0.225] |
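
As a rough illustration of how the flip, noise, and normalize rows could be applied to a detection sample, the sketch below operates on a nested-list RGB image and keeps the bounding boxes aligned with the flips. Several points are assumptions rather than details from the source: pixel values are taken to be on a 0 to 255 scale (so that the noise mean of 10 and std of 50 are meaningful), boxes are `[x_min, y_min, x_max, y_max]` in pixel-edge coordinates, and the operation order is arbitrary. Resize, crop, and rotation are omitted for brevity, and all function names are hypothetical.

```python
import random

# Normalization statistics from the table (the usual ImageNet values).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def hflip(img, boxes):
    """Mirror the image left-right and mirror the box x-coordinates."""
    w = len(img[0])
    flipped = [row[::-1] for row in img]
    return flipped, [[w - x2, y1, w - x1, y2] for x1, y1, x2, y2 in boxes]


def vflip(img, boxes):
    """Mirror the image top-bottom and mirror the box y-coordinates."""
    h = len(img)
    return img[::-1], [[x1, h - y2, x2, h - y1] for x1, y1, x2, y2 in boxes]


def add_gaussian_noise(img, rng, mean=10.0, std=50.0):
    """Add per-channel Gaussian noise, clipped to the assumed 0-255 range."""
    return [
        [[min(255.0, max(0.0, c + rng.gauss(mean, std))) for c in px] for px in row]
        for row in img
    ]


def normalize(img):
    """Scale to [0, 1], then standardize each channel."""
    return [
        [[(c / 255.0 - m) / s for c, m, s in zip(px, MEAN, STD)] for px in row]
        for row in img
    ]


def augment(img, boxes, rng):
    """Apply each random operation with probability 0.5, then normalize."""
    if rng.random() < 0.5:
        img, boxes = hflip(img, boxes)
    if rng.random() < 0.5:
        img, boxes = vflip(img, boxes)
    if rng.random() < 0.5:
        img = add_gaussian_noise(img, rng)
    return normalize(img), boxes


# Toy 2 x 4 image (height x width) with one box in the top-left corner.
image = [[[128.0, 64.0, 32.0] for _ in range(4)] for _ in range(2)]
boxes = [[0, 0, 1, 1]]
aug_image, aug_boxes = augment(image, boxes, random.Random(0))
print(aug_boxes)
```

In practice an augmentation library that transforms boxes together with pixels would replace these hand-written helpers.
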

| Train Set ↓ / Test Set → | W | ALS | ALS+W |
|---|---|---|---|
| W | ☑ | | |
| ALS | | ☑ | |
| ALS+W | ☑ | ☑ | ☑ |

| (Train Set) → (Test Set) | Faster R-CNN | FCOS | RetinaNet |
|---|---|---|---|
| (W) → (W) | 0.332 (0.295) | 0.392 (0.673) | 0.335 (0.619) |
| (ALS) → (ALS) | 0.152 (0.340) | 0.584 (0.931) | 0.306 (0.715) |
| (ALS+W) → (ALS+W) | 0.174 (0.342) | 0.531 (0.924) | 0.310 (0.691) |
| (ALS+W) → (W) | 0.200 (0.343) | 0.443 (0.904) | 0.320 (0.710) |
| (ALS+W) → (ALS) | 0.180 (0.296) | 0.541 (0.805) | 0.315 (0.593) |

Values are averages across all IoU and confidence thresholds.

| W/ALS Ratio (%) | AP (mAP@0.5) | CUM TP | CUM FP | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|
| 0/100 | 0.4965 (0.849) | 68 | 1804 | 0.3250 | 0.4402 | 0.3739 |
| 25/75 | 0.4724 (0.816) | 69 | 2514 | 0.3004 | 0.4443 | 0.3585 |
| 50/50 | 0.4573 (0.774) | 67 | 2088 | 0.2890 | 0.4307 | 0.3459 |
| 75/25 | 0.4065 (0.706) | 60 | 2284 | 0.3060 | 0.3846 | 0.3408 |
| 100/0 | 0.3645 (0.686) | 58 | 2582 | 0.2923 | 0.3769 | 0.3293 |
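
One way to assemble the training mixes for such a ratio sweep is sketched below. The source does not describe the sampling procedure, so the fixed image budget, sampling without replacement, the seed, and the helper name `mix_training_set` are all assumptions. The pool sizes reuse the training counts from the dataset table (732 W and 3784 ALS images), and the budget of 700 is a hypothetical value chosen so that the 100/0 mix is still feasible.

```python
import random


def mix_training_set(w_ids, als_ids, w_percent, total, seed=0):
    """Draw `total` image ids with `w_percent` % white-light and the rest ALS."""
    if not 0 <= w_percent <= 100:
        raise ValueError("w_percent must be between 0 and 100")
    n_w = round(total * w_percent / 100)
    n_als = total - n_w
    if n_w > len(w_ids) or n_als > len(als_ids):
        raise ValueError("not enough images for the requested mix")
    rng = random.Random(seed)
    subset = rng.sample(w_ids, n_w) + rng.sample(als_ids, n_als)
    rng.shuffle(subset)  # avoid blocks of a single modality within an epoch
    return subset


# Hypothetical id pools sized like the W and ALS training splits.
w_pool = [f"w_{i}" for i in range(732)]
als_pool = [f"als_{i}" for i in range(3784)]

sweep = {
    w: mix_training_set(w_pool, als_pool, w_percent=w, total=700)
    for w in (0, 25, 50, 75, 100)
}
for w, ids in sweep.items():
    n_w = sum(name.startswith("w_") for name in ids)
    print(f"{w}/{100 - w}: {n_w} W, {len(ids) - n_w} ALS")
```

Holding the budget constant separates the effect of composition from the effect of training-set size; a sweep that lets the total vary would confound the two.
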

| Confidence Threshold | IoU Threshold | AP | CUM TP | CUM FP | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|---|
| 0.1 | 0.1 | 0.9563 | 156 | 8392 | 0.0182 | 1 | 0.0358 |
| 0.1 | 0.3 | 0.9491 | 156 | 8392 | 0.0182 | 1 | 0.0358 |
| 0.1 | 0.5 | 0.8486 | 156 | 8392 | 0.0182 | 1 | 0.0358 |
| 0.1 | 0.7 | 0.5204 | 111 | 8437 | 0.0129 | 0.7115 | 0.0255 |
| 0.1 | 0.9 | 0.0022 | 4 | 8544 | 0.0004 | 0.0256 | 0.0009 |
| 0.3 | 0.1 | 0.9552 | 155 | 491 | 0.2399 | 0.9935 | 0.3865 |
| 0.3 | 0.3 | 0.9480 | 155 | 491 | 0.2399 | 0.9935 | 0.3865 |
| 0.3 | 0.5 | 0.8429 | 149 | 497 | 0.2306 | 0.9551 | 0.3715 |
| 0.3 | 0.7 | 0.5187 | 108 | 538 | 0.1671 | 0.6923 | 0.2693 |
| 0.3 | 0.9 | 0.0022 | 4 | 642 | 0.0061 | 0.0256 | 0.0099 |
| 0.5 | 0.1 | 0.8652 | 137 | 19 | 0.8782 | 0.8782 | 0.8782 |
| 0.5 | 0.3 | 0.8592 | 136 | 20 | 0.8717 | 0.8717 | 0.8717 |
| 0.5 | 0.5 | 0.7483 | 125 | 31 | 0.8012 | 0.8012 | 0.8012 |
| 0.5 | 0.7 | 0.4811 | 93 | 63 | 0.5961 | 0.5961 | 0.5961 |
| 0.5 | 0.9 | 0.0022 | 4 | 152 | 0.0256 | 0.0256 | 0.0256 |
| 0.7 | 0.1 | 0.1089 | 17 | 0 | 1 | 0.1089 | 0.1965 |
| 0.7 | 0.3 | 0.1089 | 17 | 0 | 1 | 0.1089 | 0.1965 |
| 0.7 | 0.5 | 0.1089 | 17 | 0 | 1 | 0.1089 | 0.1965 |
| 0.7 | 0.7 | 0.1025 | 16 | 1 | 0.9411 | 0.1025 | 0.1849 |
| 0.7 | 0.9 | 0.0007 | 1 | 16 | 0.0588 | 0.0064 | 0.0115 |
| 0.9 | 0.1 | NA | | | | | |
| 0.9 | 0.3 | NA | | | | | |
| 0.9 | 0.5 | NA | | | | | |
| 0.9 | 0.7 | NA | | | | | |
| 0.9 | 0.9 | NA | | | | | |
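
The precision, recall, and F1 columns follow from the cumulative counts. The sketch below recomputes them for the row with confidence 0.5 and IoU 0.5 (125 TP, 31 FP). The ground-truth count of 156 is not stated in the table; it is inferred from the rows where 156 true positives give a recall of 1. The function name is hypothetical.

```python
def detection_metrics(tp, fp, n_ground_truth):
    """Precision, recall, and F1 from cumulative detection counts."""
    n_pred = tp + fp
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_ground_truth if n_ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Row: confidence threshold 0.5, IoU threshold 0.5.
precision, recall, f1 = detection_metrics(tp=125, fp=31, n_ground_truth=156)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The recomputed values agree with the table to about three decimals. The small differences in the fourth decimal suggest that the tabulated figures were truncated rather than rounded.
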

| Skin Color | Dataset Config | Failed (%) | Overpredict (%) | Perfect (%) | Underpredict (%) |
|---|---|---|---|---|---|
| Very Light | U | 8.48 | 2.42 | 87.27 | 1.82 |
| | Sb | 3.31 | 10.74 | 82.64 | 3.31 |
| | LFb | 4.41 | 10.29 | 77.94 | 7.35 |
| Light | U | 7.00 | 4.00 | 82.00 | 7.00 |
| | Sb | 6.87 | 12.98 | 70.23 | 9.92 |
| | LFb | 7.41 | 10.19 | 70.37 | 12.04 |
| Intermediate | U | 8.51 | 3.19 | 84.04 | 4.26 |
| | Sb | 1.67 | 5.83 | 83.33 | 9.17 |
| | LFb | 6.76 | 10.81 | 70.27 | 12.16 |
| Tan | U | 14.99 | 6.24 | 73.75 | 5.00 |
| | Sb | 7.69 | 17.69 | 68.46 | 6.15 |
| | LFb | 12.09 | 4.40 | 76.92 | 6.59 |
| Brown | U | 12.31 | 3.85 | 76.92 | 6.93 |
| | Sb | 2.92 | 18.25 | 74.45 | 4.38 |
| | LFb | 4.62 | 16.92 | 69.23 | 9.23 |
| Dark | U | 14.87 | 2.48 | 72.72 | 9.92 |
| | Sb | 5.56 | 11.90 | 78.57 | 3.97 |
| | LFb | 13.56 | 10.17 | 64.41 | 11.86 |

| Skin Color | U (n) | U Fail % (95% CI) | Sb (n) | Sb Fail % (95% CI) | p-Value | Significant |
|---|---|---|---|---|---|---|
| Very Light | 165 | 8.5% (5.1–13.7%) | 121 | 3.3% (1.3–8.2%) | 0.088 | no |
| Light | 200 | 7.0% (4.2–11.4%) | 131 | 6.9% (3.7–12.5%) | 1.000 | no |
| Intermediate | 188 | 8.5% (5.3–13.4%) | 120 | 1.7% (0.5–5.9%) | 0.012 | yes |
| Tan | 160 | 15.0% (10.3–21.4%) | 130 | 7.7% (4.2–13.6%) | 0.066 | no |
| Brown | 130 | 12.3% (7.7–19.1%) | 137 | 2.9% (1.1–7.3%) | 0.004 | yes |
| Dark | 121 | 14.9% (9.6–22.3%) | 126 | 5.6% (2.7–11.0%) | 0.019 | yes |

| Skin Color | U (n) | U Overpredict % (95% CI) | Sb (n) | Sb Overpredict % (95% CI) | p-Value | Significant |
|---|---|---|---|---|---|---|
| Very Light | 165 | 2.4% (0.9–6.1%) | 121 | 10.7% (6.4–17.5%) | 0.004 | yes |
| Light | 200 | 4.0% (2.0–7.7%) | 131 | 13.0% (8.3–19.8%) | 0.005 | yes |
| Intermediate | 188 | 3.2% (1.5–6.8%) | 120 | 5.8% (2.9–11.5%) | 0.384 | no |
| Tan | 160 | 6.2% (3.4–11.1%) | 130 | 17.7% (12.1–25.2%) | 0.003 | yes |
| Brown | 130 | 3.8% (1.6–8.7%) | 137 | 18.2% (12.7–25.5%) | <0.001 | yes |
| Dark | 121 | 2.5% (0.8–7.0%) | 126 | 11.9% (7.4–18.7%) | 0.006 | yes |
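
The 95% intervals in the failure-rate and overprediction tables are numerically consistent with Wilson score intervals for a binomial proportion. This is an inference from the numbers, since the tables do not name the interval method. The sketch below checks the Very Light row under the U configuration, where an 8.48% failure rate on 165 images implies 14 failures (also an inferred count). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Very Light, U configuration: 14 failures out of 165 images (inferred count).
lo, hi = wilson_interval(14, 165)
print(f"{14 / 165:.1%} ({lo:.1%} to {hi:.1%})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 to 100% and behaves reasonably for the small failure counts seen in several strata.
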

| | W, no filter (%) | ALS, 450 nm O (%) | ALS, 450 nm Y (%) | ALS, 415 nm O (%) | ALS, 415 nm Y (%) |
|---|---|---|---|---|---|
| Invalid Failed Consistent | 15.29 | 5.44 | 4.97 | 5.18 | 6.01 |
| Invalid Failed Inconsistent | 10.59 | 0 | 1.99 | 1.88 | 4.37 |
| Valid Overpredict Consistent | 1.18 | 0.49 | 0.49 | 1.41 | 2.18 |
| Valid Overpredict Inconsistent | 6.47 | 0.49 | 3.48 | 1.41 | 1.63 |
| Valid Perfect Consistent | 30.59 | 74.75 | 68.15 | 70.75 | 65.57 |
| Valid Perfect Inconsistent | 25.29 | 13.36 | 14.92 | 15.56 | 16.93 |
| Valid Underpredict Consistent | 4.71 | 3.96 | 4.47 | 2.83 | 2.73 |
| Valid Underpredict Inconsistent | 5.88 | 1.48 | 1.49 | 0.94 | 0.54 |

| | Very Light | Light | Intermediate | Tan | Brown | Dark |
|---|---|---|---|---|---|---|
| Invalid Failed Consistent | 7.27 | 5 | 6.38 | 8.12 | 8.46 | 8.26 |
| Invalid Failed Inconsistent | 1.21 | 2 | 2.13 | 6.87 | 3.85 | 6.61 |
| Valid Overpredict Consistent | 0 | 1 | 0.53 | 3.12 | 0.77 | 1.65 |
| Valid Overpredict Inconsistent | 2.42 | 3 | 2.66 | 3.12 | 3.08 | 0.83 |
| Valid Perfect Consistent | 64.85 | 59.5 | 69.15 | 56.25 | 66.92 | 61.98 |
| Valid Perfect Inconsistent | 22.42 | 22.5 | 14.89 | 17.5 | 10 | 10.74 |
| Valid Underpredict Consistent | 1.21 | 4.5 | 2.66 | 3.75 | 3.08 | 7.44 |
| Valid Underpredict Inconsistent | 0.61 | 2.5 | 1.6 | 1.25 | 3.85 | 2.48 |

| Model | Decision Tree (Classification) | Random Forest (Classification) | XGBoost (Classification) | Sil. Score (Clustering) | ARI (Clustering) | CH Index (Clustering) |
|---|---|---|---|---|---|---|
| FCOS | 0.66 | 0.82 | 0.78 | 0.27 | 0.010 | 393.95 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Aminfar, K.; Scafide, K.; Wojtusiak, J.; Lattanzi, D. Evaluation Framework for Bruise Detection: Systematic ALS/White-Light Training and Skin-Tone Balancing with Deep Learning. Sensors 2026, 26, 3215. https://doi.org/10.3390/s26103215

