Search Results (95)

Search Parameters:
Keywords = Jaccard score

19 pages, 6095 KiB  
Article
MERA: Medical Electronic Records Assistant
by Ahmed Ibrahim, Abdullah Khalili, Maryam Arabi, Aamenah Sattar, Abdullah Hosseini and Ahmed Serag
Mach. Learn. Knowl. Extr. 2025, 7(3), 73; https://doi.org/10.3390/make7030073 - 30 Jul 2025
Viewed by 394
Abstract
The increasing complexity and scale of electronic health records (EHRs) demand advanced tools for efficient data retrieval, summarization, and comparative analysis in clinical practice. MERA (Medical Electronic Records Assistant) is a Retrieval-Augmented Generation (RAG)-based AI system that addresses these needs by integrating domain-specific retrieval with large language models (LLMs) to deliver robust question answering, similarity search, and report summarization functionalities. MERA is designed to overcome key limitations of conventional LLMs in healthcare, such as hallucinations, outdated knowledge, and limited explainability. To ensure both privacy compliance and model robustness, we constructed a large synthetic dataset using state-of-the-art LLMs, including Mistral v0.3, Qwen 2.5, and Llama 3, and further validated MERA on de-identified real-world EHRs from the MIMIC-IV-Note dataset. Comprehensive evaluation demonstrates MERA’s high accuracy in medical question answering (correctness: 0.91; relevance: 0.98; groundedness: 0.89; retrieval relevance: 0.92), strong summarization performance (ROUGE-1 F1-score: 0.70; Jaccard similarity: 0.73), and effective similarity search (METEOR: 0.7–1.0 across diagnoses), with consistent results on real EHRs. The similarity search module empowers clinicians to efficiently identify and compare analogous patient cases, supporting differential diagnosis and personalized treatment planning. By generating concise, contextually relevant, and explainable insights, MERA reduces clinician workload and enhances decision-making. To our knowledge, this is the first system to integrate clinical question answering, summarization, and similarity search within a unified RAG-based framework.
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
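Several results in this listing report Jaccard similarity over text, as in the summarization score above. As a point of reference, a minimal sketch of token-set Jaccard similarity, assuming simple lowercase whitespace tokenization (the abstract does not specify its tokenization):

```python
def jaccard_similarity(reference: str, candidate: str) -> float:
    """Jaccard similarity between the token sets of two texts:
    |A ∩ B| / |A ∪ B|. Two empty texts count as identical (1.0)."""
    a = set(reference.lower().split())
    b = set(candidate.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical example: 4 shared tokens, 5 in the union -> 0.8
score = jaccard_similarity("patient reports chest pain",
                           "patient reports mild chest pain")
```

Unlike ROUGE, which scores overlapping n-gram sequences, this measure is order-insensitive and counts shared vocabulary only.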

31 pages, 9156 KiB  
Article
A Comparative Analysis of Deep Learning-Based Segmentation Techniques for Terrain Classification in Aerial Imagery
by Martina Formichini and Carlo Alberto Avizzano
AI 2025, 6(7), 145; https://doi.org/10.3390/ai6070145 - 3 Jul 2025
Viewed by 564
Abstract
Background: Deep convolutional neural networks (CNNs) have become widely popular for many imaging applications, and they have also been applied in various studies for monitoring and mapping areas of land. Nevertheless, most of these networks were designed to perform in different scenarios, such as autonomous driving and medical imaging. Methods: In this work, we focused on the usage of existing semantic networks applied to terrain segmentation. Even though several existing networks have been used to study land segmentation using transfer learning methodologies, a comparative analysis of how the underlying network architectures perform has not yet been conducted. Since this scenario is different from the one in which these networks were developed, featuring irregular shapes and an absence of models, not all of them can be correctly transferred to this domain. Results: Fifteen state-of-the-art neural networks were compared, and we found that, in addition to slight differences in performance, there were relevant differences in the numbers and types of outliers that were worth highlighting. Our results show that the best-performing models achieved a pixel-level class accuracy of 99.06%, with an F1-score of 72.94%, 71.5% Jaccard loss, and 88.43% recall. When investigating the outliers, we found that PSPNet, FCN, and ICNet were the most effective models. Conclusions: While most of this work was performed on an existing terrain dataset collected using aerial imagery, this approach remains valid for investigation of other datasets with more classes or richer geographical extensions. For example, a dataset composed of Copernicus images opens up new opportunities for large-scale terrain analysis.

13 pages, 750 KiB  
Article
Semantic Evaluation of Nursing Assessment Scales Translations by ChatGPT 4.0: A Lexicometric Analysis
by Mauro Parozzi, Mattia Bozzetti, Alessio Lo Cascio, Daniele Napolitano, Roberta Pendoni, Ilaria Marcomini, Elena Sblendorio, Giovanni Cangelosi, Stefano Mancin and Antonio Bonacaro
Nurs. Rep. 2025, 15(6), 211; https://doi.org/10.3390/nursrep15060211 - 11 Jun 2025
Cited by 2 | Viewed by 1023 | Correction
Abstract
Background/Objectives: The use of standardized assessment tools within the nursing care process is a globally established practice, widely recognized as a foundation for evidence-based evaluation. Accurate translation is essential to ensure their correct and consistent clinical use. While effective, traditional procedures are time-consuming and resource-intensive, leading to increasing interest in whether artificial intelligence can assist or streamline this process for nursing researchers. Therefore, this study aimed to assess the quality of translations of nursing assessment scales performed by ChatGPT 4.0. Methods: A total of 31 nursing rating scales with 772 items were translated from English to Italian using two different prompts and then underwent a deep lexicometric analysis. To assess the semantic accuracy of the translations, Sentence-BERT, Jaccard similarity, TF-IDF cosine similarity, and overlap ratio were used. Sensitivity, specificity, AUC, and AUROC were calculated to assess the quality of the translation classification. Paired-sample t-tests were conducted to compare the similarity scores. Results: The Maastricht prompt produced translations that are marginally but consistently more semantically and lexically faithful to the original. While all differences were found to be statistically significant, the corresponding effect sizes indicate that the advantage of the Maastricht prompt is slight but consistent across all measures. The sensitivity of the prompts was 0.929 (92.9%) for York and 0.932 (93.2%) for Maastricht. Specificity and precision remained at 1.000 for both. Conclusions: Findings highlight the potential of prompt engineering as a low-cost, effective method to enhance translation outcomes. Nonetheless, as translation represents only a preliminary step in the full validation process, further studies should investigate the integration of AI-assisted translation within the broader framework of instrument adaptation and validation.
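Two of the lexical measures named in this abstract can be sketched in a few lines. The sentences and preprocessing here are illustrative only, not the study's actual pipeline (which also uses Sentence-BERT embeddings), and the cosine sketch omits IDF weighting for brevity:

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two
    texts (a simplification of TF-IDF cosine: no IDF weighting)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def overlap_ratio(a: str, b: str) -> float:
    """Overlap coefficient over token sets: |A ∩ B| / min(|A|, |B|).
    Reaches 1.0 whenever one text's vocabulary contains the other's."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))
```

The overlap coefficient, unlike Jaccard, does not penalize a longer translation for extra tokens, which is why the two are usually reported side by side.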

13 pages, 2240 KiB  
Article
Monocular 3D Tooltip Tracking in Robotic Surgery—Building a Multi-Stage Pipeline
by Sanjeev Narasimhan, Mehmet Kerem Turkcan, Mattia Ballo, Sarah Choksi, Filippo Filicori and Zoran Kostic
Electronics 2025, 14(10), 2075; https://doi.org/10.3390/electronics14102075 - 20 May 2025
Cited by 1 | Viewed by 1129
Abstract
Tracking the precise movement of surgical tools is essential for enabling automated analysis, providing feedback, and enhancing safety in robotic-assisted surgery. Accurate 3D tracking of surgical tooltips is challenging to implement when using monocular videos due to the complexity of extracting depth information. We propose a pipeline that combines state-of-the-art foundation models—Florence2 and Segment Anything 2 (SAM2)—for zero-shot 2D localization of tooltip coordinates using a monocular video input. Localization predictions are refined through supervised training of the YOLOv11 segmentation model to enable real-time applications. The depth estimation model Metric3D computes the relative depth and provides tooltip camera coordinates, which are subsequently transformed into world coordinates via a linear model estimating rotation and translation parameters. An experimental evaluation on the JIGSAWS Suturing Kinematic dataset achieves a 3D Average Jaccard score on tooltip tracking of 84.5 and 91.2 for the zero-shot and supervised approaches, respectively. The results validate the effectiveness of our approach and its potential to enhance real-time guidance and assessment in robotic-assisted surgical procedures.

32 pages, 6855 KiB  
Article
Advancing CVD Risk Prediction with Transformer Architectures and Statistical Risk Factor Filtering
by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro
Technologies 2025, 13(5), 201; https://doi.org/10.3390/technologies13050201 - 14 May 2025
Viewed by 733
Abstract
Cardiovascular disease (CVD) remains one of the leading causes of mortality worldwide, demanding accurate and timely prediction methods. Recent advancements in artificial intelligence have shown promise in enhancing clinical decision-making for CVD diagnosis. However, many existing models fail to distinguish between statistically significant and redundant risk factors, resulting in reduced interpretability and potential overfitting. This research addresses the need for a clinically meaningful and computationally efficient prediction model. The study utilizes three real-world datasets comprising demographic, clinical, and lifestyle-based risk factors relevant to CVD. A novel methodology is proposed, integrating the HEART framework for statistical feature optimization with a Transformer-based deep learning model for classification. The HEART framework employs correlation-based filtering, Akaike information criterion (AIC), and statistical significance testing to refine feature subsets. The novelty lies in combining statistical risk factor filtration with attention-driven learning, enhancing both model performance and interpretability. The proposed model is evaluated using key metrics, including accuracy, precision, recall, F1-score, AUC, and Jaccard index. Experimental results show that the Transformer model significantly outperforms baseline models, achieving 93.1% accuracy and 0.957 AUC, confirming its potential for reliable CVD prediction.
(This article belongs to the Section Assistive Technologies)

16 pages, 7715 KiB  
Article
Comprehensive Evaluation of Paprika Instance Segmentation Models Based on Segmentation Quality and Confidence Score Reliability
by Nozomu Ohta, Kota Shimomoto, Hiroki Naito, Masakazu Kashino, Sota Yoshida and Tokihiro Fukatsu
Horticulturae 2025, 11(5), 525; https://doi.org/10.3390/horticulturae11050525 - 13 May 2025
Viewed by 465
Abstract
Fruit instance segmentation models are widely researched for inference tasks such as yield prediction and automated harvesting. Previous studies evaluated these models only on the basis of mask average precision; they overlooked segmentation quality and confidence score reliability—both crucial for inference tasks. This study proposes an evaluation method that incorporates, in addition to mask average precision, the Aggregated Jaccard Index to assess segmentation quality and the coefficient of determination between confidence scores and Intersection over Union for reliability evaluation. We compared YOLO11, Mask R-CNN, and their improved variants using a dataset that included stable and comprehensive imaging obtained by monitoring equipment in a large-scale commercial paprika greenhouse. Results show that Mask Scoring R-CNN excels in segmentation quality, while YOLO11 performs better in mask average precision and confidence score reliability. These findings suggest that evaluations of instance segmentation models for real-world applications should not rely solely on mask average precision but should combine multiple metrics that assess different aspects of the model.
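The Intersection over Union underlying these segmentation metrics is straightforward per mask; a minimal sketch over binary masks follows. Note the Aggregated Jaccard Index additionally matches predicted instances to ground-truth instances before aggregating pixel counts, which is omitted here:

```python
def mask_iou(pred, gt):
    """Intersection over Union (Jaccard index) between two binary
    masks, given as equally sized 2-D lists of 0/1 pixels. Two
    all-empty masks count as a perfect match (1.0)."""
    inter = union = 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    return inter / union if union else 1.0

# Hypothetical 2x2 masks: 1 shared pixel, 2 pixels in the union -> 0.5
iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```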

25 pages, 9072 KiB  
Article
An Application Study of Machine Learning Methods for Lithological Classification Based on Logging Data in the Permafrost Zones of the Qilian Mountains
by Xudong Hu, Guo Song, Chengnan Wang, Kun Xiao, Hai Yuan, Wangfeng Leng and Yiming Wei
Processes 2025, 13(5), 1475; https://doi.org/10.3390/pr13051475 - 12 May 2025
Cited by 1 | Viewed by 490
Abstract
Lithology identification is fundamental for the logging evaluation of natural gas hydrate reservoirs. The Sanlutian field, located in the permafrost zones of the Qilian Mountains (PZQM), presents unique challenges for lithology identification due to its complex geological features, including fault development, missing and duplicated stratigraphy, and a diverse array of rock types. Conventional methods frequently encounter difficulties in precisely discerning these rock types. This study employs well logging and core data from hydrate boreholes in the region to evaluate the performance of four data-driven machine learning (ML) algorithms for lithological classification: random forest (RF), multi-layer perceptron (MLP), logistic regression (LR), and decision tree (DT). The results indicate that seven principal lithologies—sandstone, siltstone, argillaceous siltstone, silty mudstone, mudstone, oil shale, and coal—can be effectively distinguished through the analysis of logging data. Among the tested models, the random forest algorithm demonstrated superior performance, achieving optimal precision, recall, F1-score, and Jaccard coefficient values of 0.941, 0.941, 0.940, and 0.889, respectively. The models were ranked in the following order based on evaluation criteria: RF > MLP > DT > LR. This research highlights the potential of integrating artificial intelligence with logging data to enhance lithological classification in complex geological settings, providing valuable technical support for the exploration and development of gas hydrate resources.
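The four reported metrics all derive from per-class true positives, false positives, and false negatives; a minimal sketch with hypothetical lithology labels (not the study's data):

```python
def per_class_metrics(y_true, y_pred, cls):
    """Precision, recall, F1-score, and Jaccard coefficient for one
    class, computed from parallel sequences of true and predicted
    labels. Zero denominators yield 0.0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, jaccard
```

For a single class, the Jaccard coefficient TP/(TP+FP+FN) and the F1-score are linked by Jaccard = F1/(2 - F1), which is why the two values reported above (0.940 and 0.889) sit close together.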

24 pages, 12924 KiB  
Article
Analysis of Forest Change Detection Induced by Hurricane Helene Using Remote Sensing Data
by Rizwan Ahmed Ansari, Tony Esimaje, Oluwatosin Michael Ibrahim and Timothy Mulrooney
Forests 2025, 16(5), 788; https://doi.org/10.3390/f16050788 - 8 May 2025
Cited by 1 | Viewed by 510
Abstract
The occurrence of hurricanes in the southern U.S. is on the rise, and assessing the damage caused to forests is essential for implementing protective measures and comprehending recovery dynamics. This work aims to create a novel data integration framework that employs LANDSAT 8, drone-based images, and geographic information system data for change detection analysis for different forest types. We propose a method for change vector analysis based on a unique spectral mixture model utilizing composite spectral indices along with univariate difference imaging to create a change detection map illustrating disturbances in the areas of McDowell County in western North Carolina impacted by Hurricane Helene. The spectral indices included near-infrared-to-red ratios, a normalized difference vegetation index, Tasseled Cap indices, and a soil-adjusted vegetation index. In addition to the satellite imagery, the ground truth data of forest damage were also collected through field investigation and interpretation of post-Helene drone images. Accuracy assessment was conducted with geographic information system (GIS) data and maps from the National Land Cover Database, using metrics such as overall accuracy, precision, recall, F-score, Jaccard similarity, and kappa statistics. The proposed composite method performed well with overall accuracy and Jaccard similarity values of 73.80% and 0.6042, respectively. The results exhibit a reasonable correlation with GIS data and can be employed to assess damage severity.

11 pages, 226 KiB  
Communication
A Comparison of Artificial Intelligence and Human Observation in the Assessment of Cattle Handling and Slaughter
by Lily Edwards-Callaway, Huey Yi Loh, Carina Kautsky and Paxton Sullivan
Animals 2025, 15(9), 1325; https://doi.org/10.3390/ani15091325 - 3 May 2025
Viewed by 1232
Abstract
Slaughter facilities use a variety of tools to evaluate animal handling, including but not limited to live audits, the use of remote video auditing, and some AI technologies. The objective of this study was to determine the similarity between AI and human evaluator assessments of critical cattle handling outcomes in a slaughter plant. One hundred twelve video clips of cattle handling and stunning from a slaughter plant in the United Kingdom were collected. The AI identified the presence or absence of: Stunning, Electric Prod Usage, Falling, Pen Crowding, and Questionable Handling Events. Three human evaluators scored the videos for these outcomes. Four different datasets were generated, and Jaccard similarity indices were computed. There was high similarity (JI > 0.90) for Stunning, Electric Prod Usage, and Falls between the evaluators and the AI. There was high consistency (JI > 0.80) for Pen Crowding. There were differences (JI ≥ 0.50) between the humans and the AI when identifying Questionable Animal Handling Events, but the AI was adept at identifying events for further review. The implementation of AI to assist with cattle handling in a slaughter facility environment could be an added tool to enhance animal welfare programs.
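For binary presence/absence annotations like these, the Jaccard similarity index reduces to agreement on positives; a minimal sketch (the clip-level labels below are hypothetical, not the study's data):

```python
def jaccard_agreement(rater_a, rater_b):
    """Jaccard similarity between two raters' binary annotations
    (1 = event present in the clip, 0 = absent): clips both marked
    positive, divided by clips either marked positive. If neither
    rater flagged anything, agreement is defined as 1.0."""
    both = sum(1 for a, b in zip(rater_a, rater_b) if a and b)
    either = sum(1 for a, b in zip(rater_a, rater_b) if a or b)
    return both / either if either else 1.0

# Hypothetical: agree on clip 1, disagree on clips 2 and 4 -> 1/3
ji = jaccard_agreement([1, 1, 0, 0], [1, 0, 0, 1])
```

Note this formulation ignores clips both raters marked absent, so it is stricter than raw percent agreement when positive events are rare.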
26 pages, 10897 KiB  
Article
LiDAR-Based Road Cracking Detection: Machine Learning Comparison, Intensity Normalization, and Open-Source WebGIS for Infrastructure Maintenance
by Nicole Pascucci, Donatella Dominici and Ayman Habib
Remote Sens. 2025, 17(9), 1543; https://doi.org/10.3390/rs17091543 - 26 Apr 2025
Viewed by 1211
Abstract
This study introduces an innovative and scalable approach for automated road surface assessment by integrating Mobile Mapping System (MMS)-based LiDAR data analysis with an open-source WebGIS platform. In a U.S.-based case study, over 20 datasets were collected along Interstate I-65 in West Lafayette, Indiana, using the Purdue Wheel-based Mobile Mapping System—Ultra High Accuracy (PWMMS-UHA), following Indiana Department of Transportation (INDOT) guidelines. Preprocessing included noise removal, resolution reduction to 2 cm, and ground/non-ground separation using the Cloth Simulation Filter (CSF), resulting in Bare Earth (BE), Digital Terrain Model (DTM), and Above Ground (AG) point clouds. The optimized BE layer, enriched with intensity and color information, enabled crack detection through Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Random Forest (RF) classification, with and without intensity normalization. DBSCAN parameter tuning was guided by silhouette scores, while model performance was evaluated using precision, recall, F1-score, and the Jaccard Index, benchmarked against reference data. Results demonstrate that RF consistently outperformed DBSCAN, particularly under intensity normalization, achieving Jaccard Index values of 94% for longitudinal and 88% for transverse cracks. A key contribution of this work is the integration of geospatial analytics into an interactive, open-source WebGIS environment—developed using Blender, QGIS, and Lizmap—to support predictive maintenance planning. Moreover, intervention thresholds were defined based on crack surface area, aligned with the Pavement Condition Index (PCI) and FHWA standards, offering a data-driven framework for infrastructure monitoring. This study emphasizes the practical advantages of comparing clustering and machine learning techniques on 3D LiDAR point clouds, both with and without intensity normalization, and proposes a replicable, computationally efficient alternative to deep learning methods, which often require extensive training datasets and high computational resources.

21 pages, 3234 KiB  
Article
Pre-Trained Language Models for Mental Health: An Empirical Study on Arabic Q&A Classification
by Hassan Alhuzali and Ashwag Alasmari
Healthcare 2025, 13(9), 985; https://doi.org/10.3390/healthcare13090985 - 24 Apr 2025
Viewed by 867
Abstract
Background: Pre-Trained Language Models (PLMs) hold significant promise for revolutionizing mental health care by delivering accessible and culturally sensitive resources. Despite this potential, their efficacy in mental health applications, particularly in the Arabic language, remains largely unexplored. To the best of our knowledge, comprehensive studies specifically evaluating the performance of PLMs on diverse Arabic mental health tasks are still scarce. This study aims to bridge this gap by evaluating the performance of pre-trained language models in classifying questions and answers within the mental health care domain. Methods: We used the MentalQA dataset, which comprises Arabic question-and-answer interactions related to mental health. Our experiments involved four distinct learning strategies: traditional feature extraction, using PLMs as feature extractors, fine-tuning PLMs, and employing prompt-based techniques with models such as GPT-3.5 and GPT-4 in zero-shot and few-shot learning scenarios. Arabic-specific PLMs, including AraBERT, CAMelBERT, and MARBERT, were evaluated. Results: Traditional feature-extraction methods paired with Support Vector Machines (SVM) showed competitive performance, but PLMs outperformed them due to their superior ability to capture semantic nuances. In particular, MARBERT achieved the highest performance, with Jaccard scores of 0.80 for question classification and 0.86 for answer classification. Further analysis revealed that fine-tuning PLMs enhances their performance, and the size of the training dataset plays a critical role in model effectiveness. Prompt-based techniques, particularly few-shot learning with GPT-3.5, demonstrated significant improvements, increasing the accuracy of question classification by 12% and the accuracy of answer classification by 45%. Conclusions: The study demonstrates the potential of PLMs and prompt-based approaches to provide mental health support to Arabic-speaking populations, providing valuable tools for individuals seeking assistance in this field. This research advances the understanding of PLMs in mental health care and emphasizes their potential to improve accessibility and effectiveness in Arabic-speaking contexts.
(This article belongs to the Section Health Informatics and Big Data)

30 pages, 10466 KiB  
Article
Prompt Once, Segment Everything: Leveraging SAM 2 Potential for Infinite Medical Image Segmentation with a Single Prompt
by Juan D. Gutiérrez, Emilio Delgado, Carlos Breuer, José M. Conejero and Roberto Rodriguez-Echeverria
Algorithms 2025, 18(4), 227; https://doi.org/10.3390/a18040227 - 14 Apr 2025
Cited by 1 | Viewed by 1758
Abstract
Semantic segmentation of medical images holds significant potential for enhancing diagnostic and surgical procedures. Radiology specialists can benefit from automated segmentation tools that facilitate identifying and isolating regions of interest in medical scans. Nevertheless, to obtain precise results, sophisticated deep learning models tailored to this specific task must be developed and trained, a capability not universally accessible. Segment Anything Model (SAM) 2 is a foundational model designed for image and video segmentation tasks, built on its predecessor, SAM. This paper introduces a novel approach leveraging SAM 2’s video segmentation capabilities to reduce the prompts required to segment an entire volume of medical images. The study first compares SAM and SAM 2’s performance in medical image segmentation. Evaluation metrics such as the Jaccard index and Dice score are used to measure precision and segmentation quality. Then, our novel approach is introduced. Statistical tests include comparing precision gains and computational efficiency, focusing on the trade-off between resource use and segmentation time. The results show that SAM 2 achieves an average improvement of 1.76% in the Jaccard index and 1.49% in the Dice score compared to SAM, albeit with a ten-fold increase in segmentation time. Our novel approach to segmentation reduces the number of prompts needed to segment a volume of medical images by 99.95%. We demonstrate that it is possible to segment all the slices of a volume and, even more, of a whole dataset, with a single prompt, achieving results comparable to those obtained by state-of-the-art models explicitly trained for this task. Our approach simplifies the segmentation process, allowing specialists to devote more time to other tasks. The hardware and personnel requirements to obtain these results are much lower than those needed to train a deep learning model from scratch or to modify the behavior of an existing one using model modification techniques.
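Since this abstract reports both Jaccard and Dice, it is worth noting that for any single pair of masks the two are deterministic transforms of each other, so they rank results identically; the conversions are:

```python
def dice_from_jaccard(j: float) -> float:
    """Dice score equivalent to a given Jaccard index J:
    D = 2J / (1 + J). Both metrics lie in [0, 1]."""
    return 2 * j / (1 + j)

def jaccard_from_dice(d: float) -> float:
    """Inverse conversion: J = D / (2 - D)."""
    return d / (2 - d)
```

The relation holds exactly per mask pair; averaged scores over a dataset (as reported above) generally do not convert exactly, since the mean of a nonlinear transform is not the transform of the mean.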

17 pages, 7271 KiB  
Article
A Multitask CNN for Near-Infrared Probe: Enhanced Real-Time Breast Cancer Imaging
by Maryam Momtahen and Farid Golnaraghi
Sensors 2025, 25(8), 2349; https://doi.org/10.3390/s25082349 - 8 Apr 2025
Viewed by 580
Abstract
The early detection of breast cancer, particularly in dense breast tissues, faces significant challenges with traditional imaging techniques such as mammography. This study utilizes a Near-infrared Scan (NIRscan) probe and an advanced convolutional neural network (CNN) model to enhance tumor localization accuracy and efficiency. The CNN was trained on data from 133 breast phantoms, expanded to 266 samples using data augmentation techniques such as mirroring. The model significantly improved image reconstruction, achieving an RMSE of 0.0624, MAE of 0.0360, R2 of 0.9704, and Fuzzy Jaccard Index of 0.9121. Subsequently, we introduced a multitask CNN that reconstructs images and classifies them based on depth, length, and health status, further enhancing its diagnostic capabilities. This multitasking approach leverages the robust feature extraction capabilities of CNNs to perform complex tasks simultaneously, thereby improving the model’s efficiency and accuracy. It achieved exemplary classification accuracies in depth (100%), length (92.86%), and health status, with a perfect F1 Score. These results highlight the promise of NIRscan technology, in combination with a multitask CNN model, as a supportive tool for improving real-time breast cancer screening and diagnostic workflows.
(This article belongs to the Special Issue Vision- and Image-Based Biomedical Diagnostics—2nd Edition)
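The Fuzzy Jaccard Index generalizes the set-based Jaccard index to soft, real-valued pixel memberships. One common definition is sketched below, assuming pixel values in [0, 1]; the abstract does not spell out which variant the authors use:

```python
def fuzzy_jaccard(img_a, img_b):
    """Fuzzy Jaccard index between two equally sized images with
    pixel values in [0, 1]: sum of element-wise minima divided by
    sum of element-wise maxima. For binary (0/1) images this
    reduces to the ordinary Jaccard index."""
    num = den = 0.0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            num += min(a, b)
            den += max(a, b)
    return num / den if den else 1.0
```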

35 pages, 7271 KiB  
Article
Multimodal Data Fusion for Tabular and Textual Data: Zero-Shot, Few-Shot, and Fine-Tuning of Generative Pre-Trained Transformer Models
by Shadi Jaradat, Mohammed Elhenawy, Richi Nayak, Alexander Paz, Huthaifa I. Ashqar and Sebastien Glaser
AI 2025, 6(4), 72; https://doi.org/10.3390/ai6040072 - 7 Apr 2025
Cited by 1 | Viewed by 3391
Abstract
In traffic safety analysis, previous research has often focused on tabular data or textual crash narratives in isolation, neglecting the potential benefits of a hybrid multimodal approach. This study introduces the Multimodal Data Fusion (MDF) framework, which fuses tabular data with textual narratives by leveraging advanced Large Language Models (LLMs), such as GPT-2, GPT-3.5, and GPT-4.5, using zero-shot (ZS), few-shot (FS), and fine-tuning (FT) learning strategies. We employed few-shot learning with GPT-4.5 to generate new labels for traffic crash analysis, such as driver fault, driver actions, and crash factors, alongside the existing label for severity. Our methodology was tested on crash data from the Missouri State Highway Patrol, demonstrating significant improvements in model performance. GPT-2 (fine-tuned) was used as the baseline model, against which more advanced models were evaluated. GPT-4.5 few-shot learning achieved 98.9% accuracy for crash severity prediction and 98.1% accuracy for driver fault classification. In crash factor extraction, GPT-4.5 few-shot achieved the highest Jaccard score (82.9%), surpassing GPT-3.5 and fine-tuned GPT-2 models. Similarly, in driver actions extraction, GPT-4.5 few-shot attained a Jaccard score of 73.1%, while fine-tuned GPT-2 closely followed with 72.2%, demonstrating that task-specific fine-tuning can achieve performance close to state-of-the-art models when adapted to domain-specific data. These findings highlight the superior performance of GPT-4.5 few-shot learning, particularly in classification and information extraction tasks, while also underscoring the effectiveness of fine-tuning on domain-specific datasets to bridge performance gaps with more advanced models. The MDF framework’s success demonstrates its potential for broader applications beyond traffic crash analysis, particularly in domains where labeled data are scarce and predictive modeling is essential.
(This article belongs to the Section AI Systems: Theory and Applications)
21 pages, 6744 KiB  
Article
MADC-Net: Densely Connected Network with Multi-Attention for Metal Surface Defect Segmentation
by Xiaokang Ding, Xiaoliang Jiang and Sheng Wang
Symmetry 2025, 17(4), 518; https://doi.org/10.3390/sym17040518 - 29 Mar 2025
Viewed by 367
Abstract
The quality of metal products plays a crucial role in determining their overall performance, reliability, and safety. Timely and effective detection of metal surface defects is therefore of great significance. For this purpose, we present MADC-Net, a densely connected network with multi-attention for metal surface defect segmentation. Firstly, we selected ResNet50 as the encoder due to its robust performance. To capture richer contextual information from the defect feature map, we designed a densely connected network and incorporated three attention mechanisms into the decoder: a CESConv module, an efficient channel attention module (ECAM), and a simple attention module (SimAM). In addition, in the final stage of the decoder, we introduced a reconfigurable efficient attention module (REAM) to reduce redundant computation and enhance the detection of complex defect structures. Finally, a series of comprehensive comparative and ablation experiments on the publicly available SD-saliency-900 dataset and our self-constructed Bearing dataset validated that the proposed method is effective and reliable for defect segmentation. Specifically, the Dice and Jaccard scores were 88.82% and 79.96% on the SD-saliency-900 dataset, and 78.24% and 64.74% on the Bearing dataset. Full article
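For segmentation, the Dice and Jaccard scores reported above are computed over binary masks rather than label sets. A minimal sketch (illustrative, not the paper's implementation) using NumPy:

```python
import numpy as np

def dice_jaccard(pred, gt):
    """Dice and Jaccard scores between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())      # 2|A∩B| / (|A| + |B|)
    jaccard = inter / np.logical_or(pred, gt).sum()  # |A∩B| / |A∪B|
    return dice, jaccard

# Toy masks: intersection = 2 pixels, each mask has 3, union = 4
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
d, j = dice_jaccard(pred, gt)
print(round(d, 4), round(j, 4))  # 0.6667 0.5
```

The two metrics are monotonically related (Dice = 2J / (1 + J)), which is why Dice is consistently the higher of the pair in the results above.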
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)