Search Results (22,827)

Search Parameters:
Keywords = image dataset

21 pages, 5686 KB  
Article
Analysis of Spatiotemporal Characteristics of Lightning Activity in the Beijing-Tianjin-Hebei Region Based on a Comparison of FY-4A LMI and ADTD Data
by Yahui Wang, Qiming Ma, Jiajun Song, Fang Xiao, Yimin Huang, Xiao Zhou, Xiaoyang Meng, Jiaquan Wang and Shangbo Yuan
Atmosphere 2026, 17(1), 96; https://doi.org/10.3390/atmos17010096 (registering DOI) - 16 Jan 2026
Abstract
Accurate lightning data are critical for disaster warning and climate research. This study systematically compares the Fengyun-4A Lightning Mapping Imager (FY-4A LMI) satellite and the Advanced Time-of-arrival and Direction (ADTD) lightning location network in the Beijing-Tianjin-Hebei (BTH) region (April–August, 2020–2023) using coefficient of variation (CV) analysis, Welch’s independent samples t-test, Pearson correlation analysis, and inverse distance weighting (IDW) interpolation. Key results: (1) A significant systematic discrepancy exists between the two datasets, with an annual mean ratio of 0.0636 (t = −5.1758, p < 0.01); FY-4A LMI shows higher observational stability (CV = 5.46%), while ADTD excels in capturing intense lightning events (CV = 28.01%). (2) Both datasets exhibit a consistent unimodal monthly pattern peaking in July (moderately strong positive correlation, r = 0.7354, p < 0.01) but differ distinctly in diurnal distribution. (3) High-density lightning areas of both datasets concentrate south of the Yanshan Mountains and east of the Taihang Mountains, shaped by topography and water vapor transport. This study reveals the three-factor (climatic background, topographic forcing, technical characteristics) coupled regulatory mechanism of data discrepancies and highlights the complementarity of the two datasets, providing a solid scientific basis for satellite-ground data fusion and regional lightning disaster defense. Full article
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)
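For readers unfamiliar with the two statistics named in this abstract, the coefficient of variation and Welch's t statistic can be sketched in a few lines of Python. The sample values below are made up for illustration and are not the study's data:

```python
import statistics

def cv(xs):
    # coefficient of variation: sample standard deviation over the mean, in percent
    return statistics.stdev(xs) / statistics.fmean(xs) * 100

def welch_t(a, b):
    # Welch's t statistic: mean difference scaled by unpooled standard errors
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / (va / len(a) + vb / len(b)) ** 0.5

lmi = [102, 98, 101, 99, 100]   # hypothetical monthly counts from a stable sensor
adtd = [140, 80, 120, 60, 90]   # hypothetical counts from a burstier sensor

stable_cv, bursty_cv = cv(lmi), cv(adtd)  # low vs. high variability
t_stat = welch_t(lmi, adtd)
```

A low CV marks the steadier instrument and a high CV the one dominated by intense events, which is the contrast the abstract draws between FY-4A LMI and ADTD.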
20 pages, 1826 KB  
Article
Hybrid Underwater Image Enhancement via Dual Transmission Optimization and Transformer-Based Feature Fusion
by Ning Hu, Shuai Li and Jindong Tan
Sensors 2026, 26(2), 627; https://doi.org/10.3390/s26020627 (registering DOI) - 16 Jan 2026
Abstract
Due to complex underwater environments characterized by severe scattering, absorption, and color distortion, accurate restoration remains challenging. This paper proposes a hybrid approach combining dual transmission estimation, adaptive ambient light estimation with color correction, and a U-Net Transformer (Uformer) for underwater image enhancement. Our method estimates transmission maps by integrating boundary constraints and local contrast, which effectively address visibility degradation. An adaptive ambient light estimation and color correction strategy are further developed to correct color distortion robustly. Subsequently, a Uformer network enhances the restored image by capturing global and local contextual features effectively. Experiments conducted on publicly available underwater image datasets validate our approach. Performance is quantitatively evaluated using widely adopted non-reference image quality metrics, especially Underwater Image Quality Measure (UIQM) and Underwater Color Image Quality Evaluation (UCIQE). The results demonstrate that our proposed method achieves superior enhancement performance over several state-of-the-art methods. Full article
(This article belongs to the Section Sensing and Imaging)
22 pages, 202405 KB  
Article
Adaptive Expert Selection for Crack Segmentation Using a Top-K Mixture-of-Experts Framework with Out-of-Fold Supervision
by Ammar M. Okran, Hatem A. Rashwan, Sylvie Chambon and Domenec Puig
Electronics 2026, 15(2), 407; https://doi.org/10.3390/electronics15020407 (registering DOI) - 16 Jan 2026
Abstract
Cracks in civil infrastructure exhibit large variations in appearance due to differences in surface texture, illumination, and background clutter, making reliable segmentation a challenging task. To address this issue, this paper proposes an adaptive Mixture-of-Experts (MoE) framework that combines multiple crack segmentation models based on their estimated reliability for each input image. A lightweight gating network is trained using out-of-fold soft supervision to learn how to rank and select the most suitable experts under varying conditions. During inference, only the top two experts are combined to produce the final segmentation result. The proposed framework is evaluated on two public datasets—Crack500 and the CrackForest Dataset (CFD)—and one in-house dataset (RCFD). Experimental results demonstrate consistent improvements over individual models and recent state-of-the-art methods, achieving up to 2.4% higher IoU and 2.1% higher F1-score compared to the strongest single expert. These results show that adaptive expert selection provides an effective and practical solution for robust crack segmentation across diverse real-world scenarios. Full article
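The top-2 expert combination described above can be illustrated with a minimal gating sketch. The expert outputs and gate scores below are hypothetical, and the paper's gating network is a trained model, not this toy softmax:

```python
import math

def top2_moe(expert_outputs, gate_logits):
    # keep the two highest-scoring experts, renormalize their softmax
    # weights over the selected pair, and blend their per-pixel outputs
    top2 = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:2]
    weights = [math.exp(gate_logits[i]) for i in top2]
    total = sum(weights)
    weights = [v / total for v in weights]
    n_pixels = len(expert_outputs[0])
    return [
        sum(w * expert_outputs[i][p] for w, i in zip(weights, top2))
        for p in range(n_pixels)
    ]

# three hypothetical experts emitting per-pixel crack probabilities
experts = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
gate = [2.0, 1.0, -1.0]          # hypothetical gating scores
fused = top2_moe(experts, gate)  # blends experts 0 and 1 only; expert 2 is dropped
```

Restricting the blend to the top two experts keeps inference cost near that of a single model while still adapting to each input.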
25 pages, 1708 KB  
Article
Distribution Network Electrical Equipment Defect Identification Based on Multi-Modal Image Voiceprint Data Fusion and Channel Interleaving
by An Chen, Junle Liu, Wenhao Zhang, Jiaxuan Lu, Jiamu Yang and Bin Liao
Processes 2026, 14(2), 326; https://doi.org/10.3390/pr14020326 - 16 Jan 2026
Abstract
With the explosive growth in the quantity of electrical equipment in distribution networks, traditional manual inspection struggles to achieve comprehensive coverage due to limited manpower and low efficiency. This has led to frequent equipment failures including partial discharge, insulation aging, and poor contact. These issues seriously compromise the safe and stable operation of distribution networks. Real-time monitoring and defect identification of their operation status are critical to ensuring the safety and stability of power systems. Currently, commonly used methods for defect identification in distribution network electrical equipment mainly rely on single-image or voiceprint data features. These methods lack consideration of the complementarity and interleaved nature between image and voiceprint features, resulting in reduced identification accuracy and reliability. To address the limitations of existing methods, this paper proposes distribution network electrical equipment defect identification based on multi-modal image voiceprint data fusion and channel interleaving. First, image and voiceprint feature models are constructed using two-dimensional principal component analysis (2DPCA) and the Mel scale, respectively. Multi-modal feature fusion is achieved using an improved transformer model that integrates intra-domain self-attention units and an inter-domain cross-attention mechanism. Second, an image and voiceprint multi-channel interleaving model is applied. It combines channel adaptability and confidence to dynamically adjust weights and generates defect identification results using a weighting approach based on output probability information content. 
Finally, simulation results show that, under a dataset size of 3300 samples, the proposed algorithm achieves an 8.96–33.27% improvement in defect recognition accuracy compared with baseline algorithms, and maintains an accuracy of over 86.5% even under 20% random noise interference by using the improved transformer and multi-channel interleaving mechanism, verifying its advantages in accuracy and noise robustness. Full article
29 pages, 55768 KB  
Article
Distributed Artificial Intelligence for Organizational and Behavioral Recognition of Bees and Ants
by Apolinar Velarde Martinez and Gilberto Gonzalez Rodriguez
Sensors 2026, 26(2), 622; https://doi.org/10.3390/s26020622 - 16 Jan 2026
Abstract
Scientific studies have demonstrated how certain insect species can be used as bioindicators and reverse environmental degradation through their behavior and organization. Studying these species involves capturing and extracting hundreds of insects from a colony for subsequent study, analysis, and observation. This allows researchers to classify the individuals and also determine the organizational structure and behavioral patterns of the insects within colonies. The miniaturization of hardware devices for data and image acquisition, coupled with new Artificial Intelligence techniques such as Scene Graph Generation (SGG), has evolved from the detection and recognition of objects in an image to the understanding of relationships between objects and the ability to produce textual descriptions based on image content and environmental parameters. This research paper presents the design and functionality of a distributed computing architecture for image and video acquisition of bees and ants in their natural environment, in addition to a parallel computing architecture that hosts two datasets with images of real environments from which scene graphs are generated to recognize, classify, and analyze the behaviors of bees and ants while preserving and protecting these species. The experiments that were carried out are classified into two categories, namely the recognition and classification of objects in the image and the understanding of the relationships between objects and the generation of textual descriptions of the images. The results of the experiments, conducted in real-life environments, show recognition rates above 70%, classification rates above 80%, and comprehension and generation of textual descriptions with an assertive rate of 85%. Full article
27 pages, 4956 KB  
Article
StaticPigDetv2: Performance Improvement of Unseen Pig Monitoring Environment Using Depth-Based Background and Facility Information
by Seungwook Son, Munki Park, Sejun Lee, Jongwoong Seo, Seunghyun Yu, Daihee Park and Yongwha Chung
Sensors 2026, 26(2), 621; https://doi.org/10.3390/s26020621 - 16 Jan 2026
Abstract
Standard Deep Learning-based detectors generally face a trade-off between accuracy and latency, as well as a significant performance degradation when applied to unseen environments. To address these challenges, this study proposes a method that enhances both accuracy and latency by leveraging the static characteristics of fixed-camera pig pen monitoring. Specifically, we utilize background and infrastructure information obtained through a one-time preprocessing step upon camera installation. By integrating this information, we introduce three distinct modules, Background-suppressed Image Generator (BIG), Facility Image Generator (FIG), and Background Suppression Integration (BSI), that improve detection accuracy and operational efficiency without the need for model retraining. BIG creates background-suppressed images that integrate foreground and background information. FIG creates facility mask images that can be used to identify pigs that are occluded by facilities, enabling more efficient learning in unseen environments. BSI leverages both the input image and the background-suppressed image generated by BIG, feeding them into a 3D convolution layer for efficient feature fusion. This difference-aware fusion helps the model focus on foreground information and gradually reduce the domain gap. After training on the German pig dataset and testing on the unseen Korean Hadong pig dataset, the proposed method could improve AP50 accuracy (from 75% to 86%) and Jetson Orin Nano latency (from 67 ms to 41 ms) compared to the baseline model YOLOV12m. Full article
(This article belongs to the Special Issue Smart Decision Systems for Digital Farming: 2nd Edition)
18 pages, 1623 KB  
Review
AI Chatbots and Remote Sensing Archaeology: Current Landscape, Technical Barriers, and Future Directions
by Nicolas Melillos and Athos Agapiou
Heritage 2026, 9(1), 32; https://doi.org/10.3390/heritage9010032 - 16 Jan 2026
Abstract
Chatbots have emerged as a promising interface for facilitating access to complex datasets, allowing users to pose questions in natural language rather than relying on specialized technical workflows. At the same time, remote sensing has transformed archaeological practice by producing vast amounts of imagery from LiDAR, drones, and satellites. While these advances have created unprecedented opportunities for discovery, they also pose significant challenges due to the scale, heterogeneity, and interpretative demands of the data. In related scientific domains, multimodal conversational systems capable of integrating natural language interaction with image-based analysis have advanced rapidly, supported by a growing body of survey and review literature documenting their architectures, datasets, and applications across multiple fields. By contrast, archaeological applications of chatbots remain limited to text-based prototypes, primarily focused on education, cultural heritage mediation or archival search. This review synthesizes the historical development of chatbots, examines their current use in remote sensing, and evaluates the barriers to adapting such systems for archaeology. Four major challenges are identified: data scale and heterogeneity, scarcity of training datasets, computational costs, and uncertainties around usability and adoption. By comparing experiences across domains, this review highlights both the opportunities and the limitations of integrating conversational AI into archaeological workflows. The central conclusion is that domain-specific adaptation is essential if multimodal chatbots are to become effective analytical partners in archaeology. Full article
(This article belongs to the Section Digital Heritage)
23 pages, 8155 KB  
Article
MRMAFusion: A Multi-Scale Restormer and Multi-Dimensional Attention Network for Infrared and Visible Image Fusion
by Liang Dong, Guiling Sun, Haicheng Zhang and Wenxuan Luo
Appl. Sci. 2026, 16(2), 946; https://doi.org/10.3390/app16020946 - 16 Jan 2026
Abstract
Infrared and visible image fusion improves the visual representation of scenes. Current deep learning-based fusion methods typically rely on either convolution operations for local feature extraction or Transformers for global feature extraction, often neglecting the contribution of multi-scale features to fusion performance. To address this limitation, we propose MRMAFusion, a nested connection model that relies on the multi-scale restoration-Transformer (Restormer) and multi-dimensional attention. We construct an encoder–decoder architecture on UNet++ network with multi-scale local and global feature extraction using convolution blocks and Restormer. Restormer can provide global dependency and more comprehensive attention to texture details of the target region along the vertical dimension, compared to extracting features by convolution operations. Along the horizontal dimension, we enhance MRMAFusion’s multi-scale feature extraction and reconstruction capability by incorporating multi-dimensional attention into the encoder’s convolutional blocks. We perform extensive experiments on the public datasets TNO, NIR and RoadScene and compare with other state-of-the-art methods for both objective and subjective evaluation. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
18 pages, 6228 KB  
Article
All-Weather Flood Mapping Using a Synergistic Multi-Sensor Downscaling Framework: Case Study for Brisbane, Australia
by Chloe Campo, Paolo Tamagnone, Suelynn Choy, Trinh Duc Tran, Guy J.-P. Schumann and Yuriy Kuleshov
Remote Sens. 2026, 18(2), 303; https://doi.org/10.3390/rs18020303 - 16 Jan 2026
Abstract
Despite a growing number of Earth Observation satellites, a critical observational gap persists for timely, high-resolution flood mapping, primarily due to infrequent satellite revisits and persistent cloud cover. To address this issue, we propose a novel framework that synergistically fuses complementary data from three public sensor types. Our methodology harmonizes these disparate data sources by using surface water fraction as a common variable and downscaling them with flood susceptibility and topography information. This allows for the integration of sub-daily observations from the Visible Infrared Imaging Radiometer Suite and the Advanced Himawari Imager with the cloud-penetrating capabilities of the Advanced Microwave Scanning Radiometer 2. We evaluated this approach on the February 2022 flood in Brisbane, Australia using an independent ground truth dataset. The framework successfully compensates for the limitations of individual sensors, enabling the consistent generation of detailed, high-resolution flood maps. The proposed method outperformed the flood extent derived from commercial high-resolution optical imagery, scoring 77% higher than the Copernicus Emergency Management Service (CEMS) map in the Critical Success Index. Furthermore, the True Positive Rate was twice as high as the CEMS map, confirming that the proposed method successfully overcame the cloud cover issue. This approach provides valuable, actionable insights into inundation dynamics, particularly when other public data sources are unavailable. Full article
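The two scores used for this comparison, the Critical Success Index and the True Positive Rate, are simple functions of the pixel-level confusion counts. The counts below are hypothetical, not the study's:

```python
def csi(tp, fp, fn):
    # Critical Success Index (threat score): hits / (hits + false alarms + misses)
    return tp / (tp + fp + fn)

def tpr(tp, fn):
    # True Positive Rate (probability of detection): hits / (hits + misses)
    return tp / (tp + fn)

# hypothetical pixel counts for a flood map scored against ground truth
hits, false_alarms, misses = 70, 10, 20
score = csi(hits, false_alarms, misses)   # 0.7
detection = tpr(hits, misses)             # ~0.778
```

Unlike plain accuracy, CSI ignores the (typically huge) count of correctly dry pixels, which is why it is the standard headline score for flood-extent maps.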
23 pages, 2725 KB  
Article
Text- and Face-Conditioned Multi-Anchor Conditional Embedding for Robust Periocular Recognition
by Po-Ling Fong, Tiong-Sik Ng and Andrew Beng Jin Teoh
Appl. Sci. 2026, 16(2), 942; https://doi.org/10.3390/app16020942 - 16 Jan 2026
Abstract
Periocular recognition is essential when full-face images cannot be used because of occlusion, privacy constraints, or sensor limitations, yet in many deployments, only periocular images are available at run time, while richer evidence, such as archival face photos and textual metadata, exists offline. This mismatch makes it hard to deploy conventional multimodal fusion. This motivates the notion of conditional biometrics, where auxiliary modalities are used only during training to learn stronger periocular representations while keeping deployment strictly periocular-only. In this paper, we propose Multi-Anchor Conditional Periocular Embedding (MACPE), which maps periocular, facial, and textual features into a shared anchor-conditioned space via a learnable anchor bank that preserves periocular micro-textures while aligning higher-level semantics. Training combines identity classification losses on periocular and face branches with a symmetric InfoNCE loss over anchors and a pulling regularizer that jointly aligns periocular, facial, and textual embeddings without collapsing into face-dominated solutions; captions generated by a vision language model provide complementary semantic supervision. At deployment, only the periocular encoder is used. Experiments across five periocular datasets show that MACPE consistently improves Rank-1 identification and reduces EER at a fixed FAR compared with periocular-only baselines and alternative conditioning methods. Ablation studies verify the contributions of anchor-conditioned embeddings, textual supervision, and the proposed loss design. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
18 pages, 2868 KB  
Article
AdaDenseNet-LUC: Adaptive Attention DenseNet for Laryngeal Ultrasound Image Classification
by Cunyuan Luan and Huabo Liu
BioMedInformatics 2026, 6(1), 5; https://doi.org/10.3390/biomedinformatics6010005 - 16 Jan 2026
Abstract
Evaluating the difficulty of endotracheal intubation during pre-anesthesia assessment has consistently posed a challenge for clinicians. Accurate prediction of intubation difficulty is crucial for subsequent treatment planning. However, existing diagnostic methods often suffer from low accuracy. To tackle this issue, this study presented an automated airway classification method utilizing Convolutional Neural Networks (CNNs). We proposed Adaptive Attention DenseNet for Laryngeal Ultrasound Classification (AdaDenseNet-LUC), a network architecture that enhances classification performance by integrating an adaptive attention mechanism into DenseNet (Dense Convolutional Network), enabling the extraction of deep features that aid in difficult airway classification. This model associates laryngeal ultrasound images with actual intubation difficulty, providing healthcare professionals with scientific evidence to help improve the accuracy of clinical decision-making. Experiments were performed on a dataset of 1391 ultrasound images, utilizing 5-fold cross-validation to assess the model’s performance. The experimental results show that the proposed method achieves a classification accuracy of 87.41%, sensitivity of 86.05%, specificity of 88.59%, F1 score of 0.8638, and AUC of 0.94. Grad-CAM visualization techniques indicate that the model’s attention is directed to the tracheal region. The results demonstrate that the proposed method outperforms current approaches, delivering objective and accurate airway classification outcomes, which serve as a valuable reference for evaluating the difficulty of endotracheal intubation and providing guidance for clinicians. Full article
37 pages, 4259 KB  
Article
Image-Based Segmentation of Hydrogen Bubbles in Alkaline Electrolysis: A Comparison Between Ilastik and U-Net
by José Pereira, Reinaldo Souza, Arthur Normand and Ana Moita
Algorithms 2026, 19(1), 77; https://doi.org/10.3390/a19010077 - 16 Jan 2026
Abstract
This study aims to enhance the efficiency of hydrogen production through alkaline water electrolysis by analyzing hydrogen bubble dynamics using high-speed image processing and machine learning algorithms. The experiments were conducted to evaluate the effects of electrical current and ultrasound oscillations on the system performance. The bubble formation and detachment process were recorded and analyzed using two segmentation models: Ilastik, a GUI-based tool, and U-Net, a deep learning convolutional network implemented in PyTorch v. 2.9.0. Both models were trained on a dataset of 24 images under varying experimental conditions. The evaluation metrics included Intersection over Union (IoU), Root Mean Square Error (RMSE), and bubble diameter distribution. Ilastik achieved better accuracy and lower RMSE, while U-Net offered higher scalability and integration flexibility within Python environments. Both models faced challenges when detecting small bubbles and under complex lighting conditions. Improvements such as expanding the training dataset, increasing image resolution, and adopting patch-based processing were proposed. Overall, the results demonstrate that automated image segmentation can provide reliable bubble characterization, contributing to the optimization of electrolysis-based hydrogen production. Full article
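Of the metrics listed, Intersection over Union has the simplest definition; a minimal sketch on flat binary masks, for illustration only:

```python
def iou(pred, truth):
    # Intersection over Union for flat binary masks (sequences of 0/1)
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0  # two empty masks agree perfectly

# hypothetical 2x2 masks flattened row by row:
# one shared positive pixel out of three marked anywhere -> 1/3
overlap = iou([1, 1, 0, 0], [1, 0, 1, 0])
```

IoU penalizes both false positives and false negatives symmetrically, which makes it a stricter score than pixel accuracy for sparse objects such as small bubbles.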
41 pages, 2388 KB  
Article
Comparative Epidemiology of Machine and Deep Learning Diagnostics in Diabetes and Sickle Cell Disease: Africa’s Challenges, Global Non-Communicable Disease Opportunities
by Oluwafisayo Babatope Ayoade, Seyed Shahrestani and Chun Ruan
Electronics 2026, 15(2), 394; https://doi.org/10.3390/electronics15020394 - 16 Jan 2026
Abstract
Non-communicable diseases (NCDs) such as Diabetes Mellitus (DM) and Sickle Cell Disease (SCD) pose an escalating health challenge in Africa, underscored by diagnostic deficiencies, inadequate surveillance, and limited health system capacity that contribute to late diagnoses and consequent preventable complications. This review adopts a comparative framework that considers DM and SCD as complementary indicator diseases, both metabolic and genetic, and highlights intersecting diagnostic, infrastructural, and governance hurdles relevant to AI-enabled screening in resource-constrained environments. The study synthesizes epidemiological data across both African and high-income regions and methodically catalogs machine learning (ML) and deep learning (DL) research by clinical application, including risk prediction, image-based diagnostics, remote patient monitoring, privacy-preserving learning, and governance frameworks. Our key observations reveal significant disparities in disease detection and health outcomes, driven by underdiagnosis, a lack of comprehensive newborn screening for SCD, and fragmented diabetes surveillance systems in Africa, despite the availability of effective diagnostic technologies in other regions. The reviewed literature on ML/DL shows high algorithmic accuracy, particularly in diabetic retinopathy screening and emerging applications in SCD microscopy. However, most studies are constrained by small, single-site datasets that lack robust external validation and do not align well with real-world clinical workflows. The review identifies persistent implementation challenges, including data scarcity, device variability, limited connectivity, and inadequate calibration and subgroup analysis. By integrating epidemiological insights into AI diagnostic capabilities and health system realities, this work extends beyond earlier surveys to offer a comprehensive, Africa-centric, implementation-focused synthesis. 
It proposes actionable operational and policy recommendations, including offline-first deployment strategies, federated learning approaches for low-bandwidth scenarios, integration with primary care and newborn screening initiatives, and enhanced governance structures, to promote equitable and scalable AI-enhanced diagnostics for NCDs. Full article
(This article belongs to the Special Issue Machine Learning Approach for Prediction: Cross-Domain Applications)
16 pages, 10758 KB  
Article
Content-Preserving Image Style Transfer via Reversible Networks with Meta ActNorm
by Yang-Ta Kao, Hwei Jen Lin, Kai-Jun Lin and Yoshimasa Tokuyama
Electronics 2026, 15(2), 395; https://doi.org/10.3390/electronics15020395 - 16 Jan 2026
Abstract
Image style transfer aims to synthesize visually compelling images by blending the structural content of one image with the artistic style of another. While arbitrary style transfer methods such as AdaIN and WCT offer flexibility, they often suffer from content distortion and style leakage, particularly in complex or cross-domain scenarios. Recent approaches like ArtFlow address these issues through reversible architectures, effectively reducing distortion and leakage while providing consistent reconstruction. However, ArtFlow’s reliance on fixed normalization parameters limits adaptability across diverse content–style pairs, motivating further improvement. In this paper, we propose ISTMAF (Image Style Transfer based on Meta ArtFlow), a scalable and adaptive reversible framework that incorporates Meta ActNorm—a meta-network that dynamically generates input-specific normalization parameters. To further improve the integration of content and style, we introduce an algebraic–geometric parameter fusion strategy in the reverse process, along with a hierarchical aligned style loss to reduce artifacts and enhance visual coherence. Experiments on MS-COCO, WikiArt, and face datasets demonstrate that ISTMAF achieves superior content preservation and style consistency compared to recent state-of-the-art methods. Quantitative evaluations using SSIM and Gram difference further confirm its effectiveness. ISTMAF provides a flexible, high-fidelity solution for style transfer and shows strong generalization potential, paving the way for future extensions in multi-style fusion, video stylization, and 3D applications. Full article
18 pages, 3091 KB  
Article
Automated Detection of Malaria (Plasmodium) Parasites in Images Captured with Mobile Phones Using Convolutional Neural Networks
by Jhosephi Vásquez Ascate, Bill Bardales Layche, Rodolfo Cardenas Vigo, Erwin Dianderas Caut, Carlos Ramírez Calderón, Carlos Garcia Cortegano, Alejandro Reategui Pezo, Katty Arista Flores, Juan Ramírez Calderón, Cristiam Carey Angeles, Karine Zevallos Villegas, Martin Casapia Morales and Hugo Rodríguez Ferrucci
Appl. Sci. 2026, 16(2), 927; https://doi.org/10.3390/app16020927 - 16 Jan 2026
Abstract
Microscopic examination of Giemsa-stained thick blood smears remains the reference standard for malaria diagnosis, but it requires specialized personnel and is difficult to scale in resource-limited settings. We present a lightweight, smartphone-based system for automatic detection of Plasmodium parasites in thick smears captured with mobile phones attached to a conventional microscope. We built a clinically validated dataset of 400 slides from Loreto, Peru, consisting of 8625 images acquired with three smartphone models and 54,531 annotated instances of Plasmodium vivax and P. falciparum across eight morphologic classes. The workflow includes YOLOv11n-based visual-field segmentation, rescaling, tiling into 640 × 640 patches, data augmentation, and parasite detection. Four lightweight detectors were evaluated; YOLOv11n achieved the best trade-off, with an F1-score of 0.938 and an overall accuracy of 90.92% on the test subset. For diagnostic interpretability, performance was also assessed at the visual-field level by grouping detections into Vivax, Falciparum, Mixed, and Background. On a high-end smartphone (Samsung Galaxy S24 Ultra), the deployed YOLOv11n model achieved 110.9 ms latency per 640 × 640 inference (9.02 FPS). Full article
(This article belongs to the Section Applied Biosciences and Bioengineering)
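The tiling step mentioned in this workflow, cutting each visual field into 640 × 640 patches, reduces to computing crop origins. Below is a minimal sketch with edge clamping; the clamping behavior is an assumption, as the paper may handle borders differently:

```python
def axis_starts(size, tile, stride):
    # crop origins along one axis; the final crop is clamped to the image edge
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)
    return starts

def tile_coords(width, height, tile=640, stride=640):
    # top-left corners of every tile x tile crop covering the image
    return [(x, y) for y in axis_starts(height, tile, stride)
                   for x in axis_starts(width, tile, stride)]

corners = tile_coords(1000, 700)  # last column/row shifted inward to fit
```

Clamping the last tile inward (rather than padding) keeps every patch at the detector's native input size without introducing synthetic border pixels.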