Search Results (32)

Search Parameters:
Keywords = volume-level annotations

24 pages, 3507 KB  
Article
A Semi-Supervised Wildfire Image Segmentation Network with Multi-Scale Structural Fusion and Pixel-Level Contrastive Consistency
by Yong Sun, Wei Wei, Jia Guo, Haifeng Lin and Yiqing Xu
Fire 2025, 8(8), 313; https://doi.org/10.3390/fire8080313 - 7 Aug 2025
Abstract
The increasing frequency and intensity of wildfires pose serious threats to ecosystems, property, and human safety worldwide. Accurate semantic segmentation of wildfire images is essential for real-time fire monitoring, spread prediction, and disaster response. However, existing deep learning methods heavily rely on large volumes of pixel-level annotated data, which are difficult and costly to obtain in real-world wildfire scenarios due to complex environments and urgent time constraints. To address this challenge, we propose a semi-supervised wildfire image segmentation framework that enhances segmentation performance under limited annotation conditions by integrating multi-scale structural information fusion and pixel-level contrastive consistency learning. Specifically, a Lagrange Interpolation Module (LIM) is designed to construct structured interpolation representations between multi-scale feature maps during the decoding stage, enabling effective fusion of spatial details and semantic information, and improving the model’s ability to capture flame boundaries and complex textures. Meanwhile, a Pixel Contrast Consistency (PCC) mechanism is introduced to establish pixel-level semantic constraints between CutMix and Flip augmented views, guiding the model to learn consistent intra-class and discriminative inter-class feature representations, thereby reducing the reliance on large labeled datasets. Extensive experiments on two public wildfire image datasets, Flame and D-Fire, demonstrate that our method consistently outperforms other approaches under various annotation ratios. For example, with only half of the labeled data, our model achieves 5.0% and 6.4% mIoU improvements on the Flame and D-Fire datasets, respectively, compared to the baseline. This work provides technical support for efficient wildfire perception and response in practical applications. Full article
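Mean Intersection over Union (mIoU), the metric behind the 5.0% and 6.4% improvements reported above, reduces to a per-class intersection/union ratio. A minimal pure-Python sketch, with toy masks invented for illustration (not from the Flame or D-Fire datasets):

```python
def miou(pred, target, num_classes):
    """Mean Intersection over Union over integer label maps (flat lists)."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy masks (0 = background, 1 = fire).
pred   = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(round(miou(pred, target, 2), 3))  # 0.5
```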

20 pages, 11920 KB  
Article
Enhancing Tip Detection by Pre-Training with Synthetic Data for Ultrasound-Guided Intervention
by Ruixin Wang, Jinghang Wang, Wei Zhao, Xiaohui Liu, Guoping Tan, Jun Liu and Zhiyuan Wang
Diagnostics 2025, 15(15), 1926; https://doi.org/10.3390/diagnostics15151926 - 31 Jul 2025
Abstract
Objectives: Automatic tip localization is critical in ultrasound (US)-guided interventions. Although deep learning (DL) has been widely used for precise tip detection, existing methods are limited by the availability of real puncture data and expert annotations. Methods: To address these challenges, we propose a novel method that uses synthetic US puncture data to pre-train DL-based tip detectors, improving their generalization. Synthetic data are generated by fusing clinical US images of healthy controls with tips created using generative DL models. To ensure clinical diversity, we constructed a dataset from scans of 20 volunteers, covering 20 organs or anatomical regions, obtained with six different US machines and performed by three physicians with varying expertise levels. Tip diversity is introduced by generating a wide range of synthetic tips using a denoising diffusion probabilistic model (DDPM). This method synthesizes a large volume of diverse US puncture data, which are used to pre-train tip detectors, followed by training with real puncture data. Results: Our method outperforms MSCOCO pre-training on a clinical puncture dataset, achieving a 1.27–7.19% improvement in AP0.1:0.5 with varying numbers of real samples. State-of-the-art detectors also show performance gains of 1.14–1.76% when applying the proposed method. Conclusions: The experimental results demonstrate that our method enhances the generalization of tip detectors without relying on expert annotations or large amounts of real data, offering significant potential for more accurate visual guidance during US-guided interventions and broader clinical applications. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

35 pages, 954 KB  
Article
Beyond Manual Media Coding: Evaluating Large Language Models and Agents for News Content Analysis
by Stavros Doropoulos, Elisavet Karapalidou, Polychronis Charitidis, Sophia Karakeva and Stavros Vologiannidis
Appl. Sci. 2025, 15(14), 8059; https://doi.org/10.3390/app15148059 - 20 Jul 2025
Cited by 3
Abstract
The vast volume of media content, combined with the costs of manual annotation, challenges scalable codebook analysis and risks reducing decision-making accuracy. This study evaluates the effectiveness of large language models (LLMs) and multi-agent teams in structured media content analysis based on codebook-driven annotation. We construct a dataset of 200 news articles on U.S. tariff policies, manually annotated using a 26-question codebook encompassing 122 distinct codes, to establish a rigorous ground truth. Seven state-of-the-art LLMs, spanning low- to high-capacity tiers, are assessed under a unified zero-shot prompting framework incorporating role-based instructions and schema-constrained outputs. Experimental results show weighted global F1-scores between 0.636 and 0.822, with Claude-3-7-Sonnet achieving the highest direct-prompt performance. To examine the potential of agentic orchestration, we propose and develop a multi-agent system using Meta’s Llama 4 Maverick, incorporating expert role profiling, shared memory, and coordinated planning. This architecture improves the overall F1-score over the direct prompting baseline from 0.757 to 0.805 and demonstrates consistent gains across binary, categorical, and multi-label tasks, approaching commercial-level accuracy while maintaining a favorable cost–performance profile. These findings highlight the viability of LLMs, both in direct and agentic configurations, for automating structured content analysis. Full article
(This article belongs to the Special Issue Natural Language Processing in the Era of Artificial Intelligence)

19 pages, 3923 KB  
Article
Automated Aneurysm Boundary Detection and Volume Estimation Using Deep Learning
by Alireza Bagheri Rajeoni, Breanna Pederson, Susan M. Lessner and Homayoun Valafar
Diagnostics 2025, 15(14), 1804; https://doi.org/10.3390/diagnostics15141804 - 17 Jul 2025
Abstract
Background/Objective: Precise aneurysm volume measurement offers a transformative edge for risk assessment and treatment planning in clinical settings. Currently, clinical assessments rely heavily on manual review of medical imaging, a process that is time-consuming and prone to inter-observer variability. The widely accepted standard of care primarily focuses on measuring aneurysm diameter at its widest point, providing a limited perspective on aneurysm morphology and lacking efficient methods to measure aneurysm volumes. Yet, volume measurement can offer deeper insight into aneurysm progression and severity. In this study, we propose an automated approach that leverages the strengths of pre-trained neural networks and expert systems to delineate aneurysm boundaries and compute volumes on an unannotated dataset from 60 patients. The dataset includes slice-level start/end annotations for aneurysm but no pixel-wise aorta segmentations. Method: Our method utilizes a pre-trained UNet to automatically locate the aorta, employs SAM2 to track the aorta through vascular irregularities such as aneurysms down to the iliac bifurcation, and finally uses a Long Short-Term Memory (LSTM) network or expert system to identify the beginning and end points of the aneurysm within the aorta. Results: Despite no manual aorta segmentation, our approach achieves promising accuracy, predicting the aneurysm start point with an R2 score of 71%, the end point with an R2 score of 76%, and the volume with an R2 score of 92%. Conclusions: This technique has the potential to facilitate large-scale aneurysm analysis and improve clinical decision-making by reducing dependence on annotated datasets. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
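The R2 scores quoted above are the standard coefficient of determination. A minimal sketch of the computation, using hypothetical aneurysm volumes rather than the study's data:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical aneurysm volumes (cm^3): ground truth vs. model output.
true_vol = [120.0, 85.0, 200.0, 150.0]
pred_vol = [118.0, 90.0, 195.0, 155.0]
print(round(r2_score(true_vol, pred_vol), 3))  # 0.989
```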

16 pages, 1687 KB  
Article
Towards Precision Medicine in Sinonasal Tumors: Low-Dimensional Radiomic Signature Extraction from MRI
by Riccardo Biondi, Giacomo Gravante, Daniel Remondini, Sara Peluso, Serena Cominetti, Francesco D’Amore, Maurizio Bignami, Alberto Daniele Arosio and Nico Curti
Diagnostics 2025, 15(13), 1675; https://doi.org/10.3390/diagnostics15131675 - 30 Jun 2025
Abstract
Background: Sinonasal tumors are rare, accounting for 3–5% of head and neck neoplasms. Machine learning (ML) and radiomics have shown promise in tumor classification, but current models lack detailed morphological and textural characterization. Methods: This study analyzed MRI data from 145 patients (76 malignant and 69 benign) across multiple centers. Radiomic features were extracted from T1-weighted (T1-w) images with contrast and T2-weighted (T2-w) images based on manually annotated tumor volumes. A dedicated ML pipeline assessed the effectiveness of different radiomic features and their integration with clinical variables. The DNetPRO algorithm was used to extract signatures combining radiomic and clinical data. Results: The results showed that ML classification using both data types achieved a median Matthews Correlation Coefficient (MCC) of 0.60 ± 0.07. The best-performing DNetPRO models reached an MCC of 0.73 (T1-w + T2-w) and 0.61 (T1-w only). Key clinical features included symptoms and tumor size, while radiomic features provided additional diagnostic insights, particularly regarding gray-level distribution in T2-w and texture complexity in T1-w images. Conclusions: Despite its potential, ML-based radiomics faces challenges in clinical adoption due to data variability and model diversity. Standardization and interpretability are crucial for reliability. The DNetPRO approach helps explain feature importance and relationships, reinforcing the clinical relevance of integrating radiomic and clinical data for sinonasal tumor classification. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
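The Matthews Correlation Coefficient (MCC) used to score the DNetPRO models can be computed directly from a 2x2 confusion matrix. A small sketch with invented malignant-vs-benign counts, not the study's results:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Invented confusion counts for a 145-case malignant-vs-benign split.
print(round(mcc(tp=60, tn=55, fp=14, fn=16), 3))
```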

19 pages, 3835 KB  
Article
Structured Transformation of Unstructured Prostate MRI Reports Using Large Language Models
by Luca Di Palma, Fatemeh Darvizeh, Marco Alì and Deborah Fazzini
Tomography 2025, 11(6), 69; https://doi.org/10.3390/tomography11060069 - 17 Jun 2025
Cited by 2
Abstract
Objectives: To assess the ability of high-performing open-weight large language models (LLMs) in extracting key radiological features from prostate MRI reports. Methods: Five LLMs (Llama3.3, DeepSeek-R1-Llama3.3, Phi4, Gemma-2, and Qwen2.5-14B) were used to analyze free-text MRI reports retrieved from clinical practice. Each LLM processed reports three times using specialized prompts to extract (1) dimensions, (2) volume and PSA density, and (3) lesion characteristics. An experienced radiologist manually annotated the dataset, defining entities (Exam) and sub-entities (Lesion, Dimension). Feature- and physician-level performance were then assessed. Results: 250 MRI exams reported by 7 radiologists were analyzed by the LLMs. Feature-level performances showed that DeepSeek-R1-Llama3.3 exhibited the highest average score (98.6% ± 2.1%), followed by Phi4 (98.1% ± 2.2%), Llama3.3 (98.0% ± 3.0%), Qwen2.5 (97.5% ± 3.9%), and Gemma2 (96.0% ± 3.4%). All models excelled in extracting PSA density (100%) and volume (≥98.4%), while lesion extraction showed greater variability (88.4–94.0%). LLMs’ performance varied among radiologists: Physician B’s reports yielded the highest mean score (99.9% ± 0.2%), while Physician C’s resulted in the lowest (94.4% ± 2.3%). Conclusions: LLMs showed promising results in automated feature extraction from radiology reports, with DeepSeek-R1-Llama3.3 achieving the highest overall score. These models can improve clinical workflows by structuring unstructured medical text. However, a preliminary analysis of reporting styles is necessary to identify potential challenges and optimize prompt design to better align with individual physician reporting styles. This approach can further enhance the robustness and adaptability of LLM-driven clinical data extraction. Full article

14 pages, 2063 KB  
Article
Deep Learning Based Automatic Ankle Tenosynovitis Quantification from MRI in Patients with Psoriatic Arthritis: A Feasibility Study
by Saeed Arbabi, Vahid Arbabi, Lorenzo Costa, Iris ten Katen, Simon C. Mastbergen, Peter R. Seevinck, Pim A. de Jong, Harrie Weinans, Mylène P. Jansen and Wouter Foppen
Diagnostics 2025, 15(12), 1469; https://doi.org/10.3390/diagnostics15121469 - 9 Jun 2025
Cited by 1
Abstract
Background/Objectives: Tenosynovitis is a common feature of psoriatic arthritis (PsA) and is typically assessed using semi-quantitative magnetic resonance imaging (MRI) scoring. However, visual scoring is subject to variability. This study evaluates a fully automated, deep-learning approach for ankle tenosynovitis segmentation and volume-based quantification from MRI in PsA patients. Methods: We analyzed 364 ankle 3T MRI scans from 71 PsA patients. Four tenosynovitis pathologies were manually scored and used to create ground truth segmentations through a human–machine workflow. For each pathology, 30 annotated scans were used to train a deep-learning segmentation model based on the nnUNet framework, and 20 scans were used for testing, ensuring patient-level disjoint sets. Model performance was evaluated using Dice scores. Volumetric pathology measurements from test scans were compared to radiologist scores using Spearman correlation. Additionally, 218 serial MRI pairs were assessed to analyze the relationship between changes in pathology volume and changes in visual scores. Results: The segmentation model achieved promising performance on the test set, with mean Dice scores ranging from 0.84 to 0.92. Pathology volumes correlated with visual scores across all test MRIs (Spearman ρ = 0.52–0.62). Volume-based quantification captured changes in inflammation over time and identified subtle progression not reflected in semi-quantitative scores. Conclusions: Our automated segmentation tool enables fast and accurate quantification of ankle tenosynovitis in PsA patients. It may enhance sensitivity to disease progression and complement visual scoring through continuous, volume-based metrics. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
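The Dice similarity coefficient behind the 0.84–0.92 results above reduces to a one-line formula on binary masks. A minimal sketch with made-up masks standing in for tenosynovitis segmentations:

```python
def dice(pred, target):
    """Dice similarity coefficient for binary masks (flat 0/1 lists)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 2 * inter / total if total else 1.0

# Made-up binary masks: 2 overlapping voxels out of 3 + 3 -> 2*2/6.
print(round(dice([1, 1, 0, 1, 0, 0], [1, 0, 0, 1, 1, 0]), 3))  # 0.667
```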

21 pages, 1276 KB  
Article
Quantifying Truthfulness: A Probabilistic Framework for Atomic Claim-Based Misinformation Detection
by Fahim Sufi and Musleh Alsulami
Mathematics 2025, 13(11), 1778; https://doi.org/10.3390/math13111778 - 27 May 2025
Cited by 1
Abstract
The increasing sophistication and volume of misinformation on digital platforms necessitate scalable, explainable, and semantically granular fact-checking systems. Existing approaches typically treat claims as indivisible units, overlooking internal contradictions and partial truths, thereby limiting their interpretability and trustworthiness. This paper addresses this gap by proposing a novel probabilistic framework that decomposes complex assertions into semantically atomic claims and computes their veracity through a structured evaluation of source credibility and evidence frequency. Each atomic unit is matched against a curated corpus of 11,928 cyber-related news entries using a binary alignment function, and its truthfulness is quantified via a composite score integrating both source reliability and support density. The framework introduces multiple aggregation strategies—arithmetic and geometric means—to construct claim-level veracity indices, offering both sensitivity and robustness. Empirical evaluation across eight cyber misinformation scenarios—encompassing over 40 atomic claims—demonstrates the system’s effectiveness. The model achieves a Mean Squared Error (MSE) of 0.037, Brier Score of 0.042, and a Spearman rank correlation of 0.88 against expert annotations. When thresholded for binary classification, the system records a Precision of 0.82, Recall of 0.79, and an F1-score of 0.805. The Expected Calibration Error (ECE) of 0.068 further validates the trustworthiness of the score distributions. These results affirm the framework’s ability to deliver interpretable, statistically reliable, and operationally scalable misinformation detection, with implications for automated journalism, governmental monitoring, and AI-based verification platforms. Full article
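The abstract names arithmetic- and geometric-mean aggregation of atomic-claim scores, plus a Brier score for calibration. A hedged sketch of both, with invented scores rather than the paper's data:

```python
import math

def claim_veracity(atomic_scores):
    """Aggregate atomic-claim truth scores into claim-level indices."""
    n = len(atomic_scores)
    arith = sum(atomic_scores) / n                # sensitive average
    geom = math.prod(atomic_scores) ** (1.0 / n)  # punishes any near-zero score
    return arith, geom

def brier(probs, labels):
    """Brier score: mean squared gap between probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

# Invented scores for one claim with three atomic sub-claims ...
arith, geom = claim_veracity([0.9, 0.8, 0.3])
print(round(arith, 3), round(geom, 3))  # 0.667 0.6
# ... and calibration of four claim-level probabilities against expert labels.
print(round(brier([0.9, 0.2, 0.7, 0.4], [1, 0, 1, 0]), 3))  # 0.075
```

The geometric mean drops sharply when any atomic claim scores near zero, which is why the paper offers both aggregations: one for sensitivity, one for robustness.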

28 pages, 1007 KB  
Article
Predicting the Event Types in the Human Brain: A Modeling Study Based on Embedding Vectors and Large-Scale Situation Type Datasets in Mandarin Chinese
by Xiaorui Ma and Hongchao Liu
Appl. Sci. 2025, 15(11), 5916; https://doi.org/10.3390/app15115916 - 24 May 2025
Abstract
Event types classify Chinese verbs based on the internal temporal structure of events. The categorization of verb event types is the most fundamental classification of concept types represented by verbs in the human brain. Meanwhile, event types exhibit strong predictive capabilities for exploring collocational patterns between words, making them crucial for Chinese teaching. This work focuses on constructing a statistically validated gold-standard dataset, forming the foundation for achieving high accuracy in recognizing verb event types. Utilizing a manually annotated dataset of verbs and aspectual markers’ co-occurrence features, the research conducts hierarchical clustering of Chinese verbs. The resulting dendrogram indicates that verbs can be categorized into three event types—state, activity and transition—based on semantic distance. Two approaches are employed to construct vector matrices: a supervised method that derives word vectors based on linguistic features, and an unsupervised method that uses four models to extract embedding vectors, including Word2Vec, FastText, BERT and ChatGPT. The classification of verb event types is performed using three classifiers: multinomial logistic regression, support vector machines and artificial neural networks. Experimental results demonstrate the superior performance of embedding vectors. Employing the pre-trained FastText model in conjunction with an artificial neural network classifier, the model achieves an accuracy of 98.37% in predicting 3133 verbs, thereby enabling the automatic identification of event types at the level of Chinese verbs and validating the high accuracy and practical value of embedding vectors in addressing complex semantic relationships and classification tasks. 
This work constructs datasets of considerable semantic complexity, comprising a substantial volume of verbs along with their feature vectors and situation type labels, which can be used for evaluating large language models in the future. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence and Semantic Mining Technology)

18 pages, 3451 KB  
Article
Integrating Neural Networks for Automated Video Analysis of Traffic Flow Routing and Composition at Intersections
by Maros Jakubec, Michal Cingel, Eva Lieskovská and Marek Drliciak
Sustainability 2025, 17(5), 2150; https://doi.org/10.3390/su17052150 - 2 Mar 2025
Cited by 5
Abstract
Traffic flow at intersections is influenced by spatial design, control methods, technical equipment, and traffic volume. This article focuses on detecting traffic flows at intersections using video recordings, employing a YOLO-based framework for automated analysis. We compare manual evaluation with machine processing to demonstrate the efficiency improvements in traffic engineering tasks through automated traffic data analysis. The output data include traditionally immeasurable parameters, such as speed and vehicle gaps within the observed intersection area. The traffic analysis incorporates findings from monitoring groups of vehicles, focusing on their formation and speed as they traverse the intersection. Our proposed system for monitoring and classifying traffic flow was implemented at a selected intersection in the city of Zilina, Slovak Republic, as part of a pilot study for this research initiative. Based on evaluations using local data, the YOLOv9c detection model achieved an mAP50 of 98.2% for vehicle localization and classification across three basic classes: passenger cars, trucks, and buses. Despite the high detection accuracy of the model, the automated annotations for vehicle entry and exit at the intersection showed varying levels of accuracy compared to manual evaluation. On average, the mean absolute error between annotations by traffic specialists and the automated framework for the most frequent class, passenger cars, was 2.73 across all directions at 15 min intervals. This indicates that approximately three passenger cars per 15 min interval were either undetected or misclassified. Full article

16 pages, 17731 KB  
Article
A Refined Approach to Segmenting and Quantifying Inter-Fracture Spaces in Facial Bone CT Imaging
by Doohee Lee, Kanghee Lee, Dae-Hyun Park, Gwiseong Moon, Inseo Park, Yeonjin Jeong, Kun-Yong Sung, Hyun-Soo Choi and Yoon Kim
Appl. Sci. 2025, 15(3), 1539; https://doi.org/10.3390/app15031539 - 3 Feb 2025
Abstract
The human facial bone is made up of many complex structures, which makes it challenging to accurately analyze fractures. To address this, we developed advanced image analysis software which segments and quantifies spaces between fractured bones in facial CT images at the pixel level. This study used 3D CT scans from 1766 patients who had facial bone fractures at a university hospital between 2014 and 2020. Our solution included a segmentation model which focuses on identifying the gaps created by facial bone fractures. However, training this model required costly pixel-level annotations. To overcome this, we used a stepwise annotation approach. First, clinical specialists marked the bounding boxes of fracture areas. Next, trained specialists created the initial pixel-level unrefined ground truth by referencing the bounding boxes. Finally, we created a refined ground truth to correct human errors, which helped improve the segmentation accuracy. Radiomics feature analysis confirmed that the refined dataset had more consistent patterns compared with the unrefined dataset, showing improved reliability. The segmentation model showed significant improvement in the Dice similarity coefficient, increasing from 0.33 with the unrefined ground truth to 0.67 with the refined ground truth. This research introduced a new method for segmenting spaces between fractured bones, allowing for precise pixel-level identification of fracture regions. The model also helped with quantitative severity assessment and enabled the creation of 3D volume renderings, which can be used in clinical settings to develop more accurate treatment plans and improve outcomes for patients with facial bone fractures. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) in Biomedical Image Processing)

17 pages, 1272 KB  
Article
Segmentation-Based Measurement of Orbital Structures: Achievements in Eyeball Volume Estimation and Barriers in Optic Nerve Analysis
by Yong Oh Lee, Hana Kim, Yeong Woong Chung, Won-Kyung Cho, Jungyul Park and Ji-Sun Paik
Diagnostics 2024, 14(23), 2643; https://doi.org/10.3390/diagnostics14232643 - 23 Nov 2024
Abstract
Background/Objective: Orbital diseases often require precise measurements of eyeball volume, optic nerve sheath diameter (ONSD), and apex-to-eyeball distance (AED) for accurate diagnosis and treatment planning. This study aims to automate and optimize these measurements using advanced deep learning segmentation techniques on orbital Computed Tomography (CT) scans. Methods: Orbital CT datasets from individuals of various age groups and genders were used, with annotated masks for the eyeball and optic nerve. A 2D attention U-Net architecture was employed for segmentation, enhanced with slice-level information embeddings to improve contextual understanding. After segmentation, the relevant metrics were calculated from the segmented structures and evaluated for clinical applicability. Results: The segmentation model demonstrated varying performance across orbital structures, achieving a Dice score of 0.8466 for the eyeball and 0.6387 for the optic nerve. Consequently, eyeball-related metrics, such as eyeball volume, exhibited high accuracy, with a root mean square error (RMSE) of 1.28–1.90 cm3 and a mean absolute percentage error (MAPE) of 12–21% across different genders and age groups. In contrast, the lower accuracy of optic nerve segmentation led to less reliable measurements of optic nerve sheath diameter (ONSD) and apex-to-eyeball distance (AED). Additionally, the study analyzed the automatically calculated measurements from various perspectives, revealing key insights and areas for improvement. Conclusions: Despite these challenges, the study highlights the potential of deep learning-based segmentation to automate the assessment of ocular structures, particularly in measuring eyeball volume, while leaving room for further improvement in optic nerve analysis. Full article
(This article belongs to the Special Issue Deep Learning in Medical Image Segmentation and Diagnosis)
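Volume from a binary segmentation, and the MAPE used above to judge it, are simple to state in code. A sketch under the assumption of isotropic voxel spacing; all numbers are illustrative, not from the study:

```python
def mask_volume_cm3(mask, spacing_mm):
    """Volume of a binary voxel mask, given voxel spacing (x, y, z) in mm."""
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return sum(mask) * voxel_mm3 / 1000.0  # mm^3 -> cm^3

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative only: 7000 eyeball voxels at 1 mm isotropic spacing -> 7.0 cm^3.
print(mask_volume_cm3([1] * 7000, (1.0, 1.0, 1.0)))  # 7.0
# Two hypothetical eyeballs, each estimated 10% off -> MAPE of 10%.
print(round(mape([7.0, 6.5], [6.3, 7.15]), 2))  # 10.0
```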

16 pages, 32078 KB  
Article
Maritime Electro-Optical Image Object Matching Based on Improved YOLOv9
by Shiman Yang, Zheng Cao, Ningbo Liu, Yanli Sun and Zhongxun Wang
Electronics 2024, 13(14), 2774; https://doi.org/10.3390/electronics13142774 - 15 Jul 2024
Cited by 20 | Correction
Abstract
The offshore environment is complex during automatic target annotation at sea, and the difference between the focal lengths of visible and infrared sensors is large, thereby causing difficulties in matching multitarget electro-optical images at sea. This study proposes a target-matching method for visible and infrared images at sea based on decision-level topological relations. First, YOLOv9 is used to detect targets. To obtain markedly accurate target positions to establish accurate topological relations, the YOLOv9 model is improved for its poor accuracy for small targets, high computational complexity, and difficulty in deployment. To improve the detection accuracy of small targets, an additional small target detection head is added to detect shallow feature maps. From the perspective of reducing network size and achieving lightweight deployment, the Conv module in the model is replaced with DWConv, and the RepNCSPELAN4 module in the backbone network is replaced with the C3Ghost module. The replacements significantly reduce the number of parameters and computation volume of the model while retaining the feature extraction capability of the backbone network. Experimental results of the photovoltaic dataset show that the proposed method improves detection accuracy by 8%, while the computation and number of parameters of the model are reduced by 5.7% and 44.1%, respectively. Lastly, topological relationships are established for the target results, and targets in visible and infrared images are matched based on topological similarity. Full article
(This article belongs to the Section Computer Science & Engineering)

42 pages, 9098 KB  
Review
Consequential Advancements of Self-Supervised Learning (SSL) in Deep Learning Contexts
by Mohammed Majid Abdulrazzaq, Nehad T. A. Ramaha, Alaa Ali Hameed, Mohammad Salman, Dong Keon Yon, Norma Latif Fitriyani, Muhammad Syafrudin and Seung Won Lee
Mathematics 2024, 12(5), 758; https://doi.org/10.3390/math12050758 - 3 Mar 2024
Cited by 41 | Viewed by 8077
Abstract
Self-supervised learning (SSL) is a promising deep learning (DL) technique that uses massive volumes of unlabeled data to train neural networks. SSL techniques have evolved in response to the poor classification performance of conventional and even modern machine learning (ML) and DL models on the enormous amounts of unlabeled data produced periodically across disciplines. However, the literature does not fully address the practicality and applicability of SSL in industrial engineering and medicine. Accordingly, this thorough review is conducted to identify these prominent possibilities for prediction, focusing on the industrial and medical fields. This extensive survey, with its pivotal outcomes, could support industrial engineers and medical personnel in efficiently predicting machinery faults and patients’ ailments without resorting to traditional numerical models that require massive computational budgets, time, storage, and effort for data annotation. Additionally, the review’s many ideas could encourage industry and healthcare actors to apply SSL principles in an agile manner to achieve precise maintenance prognostics and illness diagnosis with remarkable accuracy and feasibility, simulating functional human thinking and cognition without compromising prediction efficacy. Full article
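A core mechanism behind many of the SSL methods such a review surveys is a contrastive objective: the network is rewarded for mapping two augmented views of the same unlabeled sample close together and different samples apart, so no human labels are needed. A minimal stand-alone sketch of the InfoNCE loss for a single anchor (plain Python, not any specific framework's implementation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor: -log of the softmax probability
    assigned to the positive pair, at temperature tau. The 'labels'
    come for free from knowing which view was derived from which sample."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / tau) for s in sims]
    return -math.log(exps[0] / sum(exps))
```

When the positive view's embedding already matches the anchor and the negatives do not, the loss is near zero; when a negative looks like the anchor instead, the loss grows, pushing the encoder to separate them.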
(This article belongs to the Special Issue Application of Artificial Intelligence in Decision Making)

7 pages, 1309 KB  
Proceeding Paper
Semi-Supervised Adaptation for Skeletal-Data-Based Human Action Recognition
by Haitao Tian and Pierre Payeur
Eng. Proc. 2023, 58(1), 25; https://doi.org/10.3390/ecsa-10-16083 - 15 Nov 2023
Viewed by 1238
Abstract
Recent research on human action recognition is largely facilitated by skeletal data, a compact representation composed of key joints of the human body. However, leveraging artificial intelligence on such sensory input requires the collection and annotation of a large volume of skeleton data, which is extremely time-consuming. In this paper, a two-phase semi-supervised learning approach is proposed to overcome the heavy requirement for labeled skeletal data while training a capable human action recognition model adapted to a target environment. In the first phase, an unsupervised model is trained in a contrastive learning fashion to extract high-level semantic representations of human actions from an unlabeled source dataset. The resulting pretrained model is then fine-tuned on a small number of properly labeled samples from the target environment. Experiments are conducted on large-scale human action recognition datasets to evaluate the effectiveness of the proposed method. Full article
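The two-phase protocol described in the abstract can be sketched in miniature: phase 1 learns a representation from unlabeled source data, phase 2 fits a classifier on a handful of labeled target samples in that representation. This toy version uses simple standardization as the "pretrained" representation and a nearest-centroid classifier as the fine-tuned head; it is a stand-in for illustration only, not the paper's skeleton encoder or contrastive objective:

```python
# Phase 1 stand-in: learn a transform from unlabeled data (here, per-feature
# standardization; in the paper, a contrastively pretrained encoder).
def pretrain(unlabeled):
    n, d = len(unlabeled), len(unlabeled[0])
    mean = [sum(x[i] for x in unlabeled) / n for i in range(d)]
    std = [max(1e-8, (sum((x[i] - mean[i]) ** 2 for x in unlabeled) / n) ** 0.5)
           for i in range(d)]
    return lambda x: [(x[i] - mean[i]) / std[i] for i in range(d)]

# Phase 2 stand-in: fit a nearest-centroid classifier on a few labeled
# target samples, reusing the phase-1 representation.
def finetune(encode, labeled):
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(encode(x))
    cents = {y: [sum(col) / len(vs) for col in zip(*vs)]
             for y, vs in groups.items()}
    def predict(x):
        z = encode(x)
        return min(cents, key=lambda y: sum((a - b) ** 2
                                            for a, b in zip(z, cents[y])))
    return predict
```

The point of the structure is that phase 1 consumes only unlabeled data, so the expensive annotation effort is confined to the small labeled set used in phase 2.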
