Search Results (5,104)

Search Parameters:
Keywords = ResNet164

6865 KB  
Proceeding Paper
Evaluating Semantic Segmentation Performance Using DeepLabv3+ with Pretrained ResNet Backbones and Multi-Class Annotations
by Matej Spajić, Marija Habijan, Danijel Marinčić and Irena Galić
Eng. Proc. 2026, 125(1), 23; https://doi.org/10.3390/engproc2026125023 - 16 Feb 2026
Abstract
Semantic segmentation is a critical task in computer vision, enabling dense classification of image regions. This work investigates the effectiveness of the DeepLabv3+ architecture for binary semantic segmentation using annotated image data. A pretrained ResNet-101 backbone is employed to extract deep features, while Atrous Spatial Pyramid Pooling (ASPP) and a decoder module refine the segmentation outputs. The dataset provides per-image annotations indicating class presence, which are leveraged to approximate segmentation masks for training purposes. Various data augmentation techniques and training strategies were applied to support effective learning and reduce overfitting. Experimental results on the MHIST dataset show that the proposed pipeline achieves strong performance despite the lack of pixel-level annotations, with a mean Intersection-over-Union (mIoU) of 0.76 and a mean Dice coefficient of 0.84. These results confirm the potential of weakly supervised segmentation using class-aware CAMs and deep pretrained encoders for structured pixel-level prediction tasks in medical imaging.
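The mIoU and Dice numbers quoted in this abstract are standard overlap metrics on binary masks. A minimal NumPy sketch of how they are typically computed (illustrative only, not the authors' code):

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, gt: np.ndarray):
    """IoU and Dice for a pair of boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)

def mean_iou_dice(pred: np.ndarray, gt: np.ndarray):
    """Mean over the two classes of a binary problem (background, foreground)."""
    scores = [iou_and_dice(pred == c, gt == c) for c in (0, 1)]
    ious, dices = zip(*scores)
    return sum(ious) / 2, sum(dices) / 2
```

For example, a 2×2 prediction matching the ground truth in half its pixels yields mIoU = 1/3 and mean Dice = 0.5.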
16 pages, 3335 KB  
Article
A Robust mmWave Radar Framework for Accurate People Counting and Motion Classification
by Nuobei Zhang, Haoxuan Li, Adnan Zahid, Yue Tian and Wenda Li
Sensors 2026, 26(4), 1289; https://doi.org/10.3390/s26041289 - 16 Feb 2026
Abstract
People counting and occupancy monitoring play a vital role in applications such as intelligent building management, safety control, and resource optimization in future smart cities. Conventional camera- and infrared-based methods often suffer from privacy risks, lighting dependency, and limited robustness in complex indoor environments. In this paper, we present a 60 GHz millimeter-wave (mmWave) radar-based occupancy monitoring system that enables accurate and privacy-preserving people counting. The proposed system leverages echo signals processed through Doppler and range spectrograms and analyzed by an enhanced ResNet-50 deep learning model to classify motion states and count individuals. Experimental results collected in a typical indoor environment demonstrate that the system achieves 95.45% accuracy across six classes of movements and 98.86% accuracy for people counting (0–3 persons). The method also shows strong adaptability under limited data and robustness to Gaussian blur interference, providing an efficient and reliable solution for intelligent indoor occupancy monitoring.
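A Doppler spectrogram of the kind fed to the ResNet-50 here is essentially a windowed short-time FFT over the radar's slow-time signal. A schematic NumPy sketch (frame/hop sizes are hypothetical placeholders, not the paper's settings):

```python
import numpy as np

def doppler_spectrogram(iq: np.ndarray, frame: int = 64, hop: int = 32) -> np.ndarray:
    """Short-time FFT magnitude of a complex slow-time radar signal.

    Returns shape (n_frames, frame): one zero-centered Doppler spectrum
    per time frame.
    """
    win = np.hanning(frame)
    n_frames = 1 + (len(iq) - frame) // hop
    frames = np.stack([iq[i * hop : i * hop + frame] * win
                       for i in range(n_frames)])
    # fftshift puts zero Doppler (a static target) in the middle column
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return np.abs(spec)
```

A pure tone at 0.25 cycles/sample lands in bin 16 of a 64-point FFT, i.e. column 48 after the shift, so the spectrogram localizes a constant radial velocity as a bright vertical stripe.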

24 pages, 1870 KB  
Article
Class Imbalance-Aware Deep Learning Approach for Apple Leaf Disease Recognition
by Emrah Fidan, Serra Aksoy, Pinar Demircioglu and Ismail Bogrekci
AgriEngineering 2026, 8(2), 70; https://doi.org/10.3390/agriengineering8020070 - 16 Feb 2026
Abstract
Apple leaf disease identification with high precision is one of the main challenges in precision agriculture. The datasets usually suffer from class imbalance and environmental variation, which negatively impact deep learning approaches. In this paper, an ablation study is proposed to test three different scenarios: V1, a hybrid balanced dataset consisting of 10,028 images; V2, an imbalanced baseline dataset consisting of 14,582 original images; and V3, a 3× physical augmentation approach based on the 14,582 images. The classification performance of YOLOv11x was benchmarked against three state-of-the-art CNN architectures: ResNet-152, DenseNet-201, and EfficientNet-B1. The methodology incorporates controlled downsampling for dominant classes alongside scenario-based augmentation for minority classes, utilizing CLAHE-based texture enhancement, illumination simulation, and sensor noise generation. All the models were trained for up to 100 epochs under identical experimental conditions, with early stopping based on validation performance and an 80/20 train-validation split. The experimental results demonstrate that the impact of balancing strategies is model-dependent and does not universally improve performance. This highlights the importance of aligning data balancing strategies with architectural characteristics rather than applying uniform resampling approaches. YOLOv11x achieved its peak accuracy of 99.18% within the V3 configuration, marking a +0.62% improvement over the V2 baseline (99.01%). In contrast, EfficientNet-B1 reached its optimal performance in the V2 configuration (98.43%) without additional intervention. While all the models exhibited consistently high AUC values (≥99.94%), DenseNet-201 achieved the highest value (99.97%) across both V2 and V3 configurations. In fine-grained discrimination, the superior performance of YOLOv11x on challenging cases is verified, with only one incorrect classification (Rust to Scab), while ResNet-152 and DenseNet-201 incorrectly classified eight and seven samples, respectively. Degradation sensitivity analysis under controlled Gaussian noise and motion blur indicated that the CNN baseline models maintained stable performance. High minority-class reliability, including a 96.20% F1-score for Grey Spot and 100% precision for Mosaic, further demonstrates effective fine-grained discrimination. Results indicate that data preservation with physically inspired augmentation (V3) outperforms resampling-based balancing (V1), especially in terms of global accuracy and minority-class performance.
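Besides resampling and augmentation of the kind compared above, a common alternative for class imbalance is inverse-frequency loss weighting. A sketch of the scikit-learn-style "balanced" heuristic, n_samples / (n_classes × count_c) — shown only for context; the paper itself compares dataset-level strategies, not loss weighting:

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray) -> dict:
    """Per-class loss weights inversely proportional to class frequency:
    w_c = n_samples / (n_classes * count_c)."""
    classes, counts = np.unique(labels, return_counts=True)
    w = counts.sum() / (len(classes) * counts)
    return dict(zip(classes.tolist(), w.tolist()))
```

With labels [0, 0, 0, 1], the minority class receives weight 2.0 and the majority class 2/3, so each class contributes equally to the expected loss.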
31 pages, 5533 KB  
Article
Comparative Evaluation of Fusion Strategies Using Multi-Pretrained Deep Learning Fusion-Based (MPDLF) Model for Histopathology Image Classification
by Fatma Alshohoumi and Abdullah Al-Hamdani
Appl. Sci. 2026, 16(4), 1964; https://doi.org/10.3390/app16041964 - 16 Feb 2026
Abstract
Histopathological image analysis remains the cornerstone of cancer diagnosis; however, manual assessment is challenged by stain variability, differences in imaging magnification, and complex morphological patterns. The proposed multi-pretrained deep learning fusion (MPDLF) approach combines two widely used CNN architectures: ResNet50, which captures deeper semantic representations, and VGG16, which extracts fine-grained details. This work differs from previous fusion studies by providing a controlled evaluation of early, intermediate, and late fusion for integrating two pretrained CNN backbones (ResNet50 and VGG16) under single-modality histopathology constraints. To isolate the fusion effect, identical training settings are used across three public H&E datasets. Early fusion achieved the best test performance for the two primary tasks reported here: breast cancer binary classification (accuracy = 0.9070, 95% CI: 0.8742–0.9404; AUC = 0.9707, 95% CI: 0.9541–0.9844) and renal clear cell carcinoma (RCCC) five-class grading (accuracy = 0.8792, 95% CI: 0.8529–0.9041; AUC (OvR, macro) = 0.9895, 95% CI: 0.9859–0.9927). Future work will extend these experiments to additional magnification levels (100×, 200×, and 400×) for breast cancer histopathology images and explore advanced hybrid fusion strategies across different histopathology datasets.
(This article belongs to the Special Issue AI for Medical Systems: Algorithms, Applications, and Challenges)
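The two endpoints of the fusion spectrum compared in this study can be summarized abstractly: early fusion joins backbone feature vectors before a shared classifier, while late fusion combines per-model predictions (intermediate fusion merges mid-network feature maps instead). A schematic NumPy sketch, not the authors' implementation:

```python
import numpy as np

def early_fusion(f_resnet: np.ndarray, f_vgg: np.ndarray) -> np.ndarray:
    """Early fusion: concatenate backbone feature vectors, then classify once."""
    return np.concatenate([f_resnet, f_vgg], axis=-1)

def late_fusion(p_resnet: np.ndarray, p_vgg: np.ndarray) -> np.ndarray:
    """Late fusion: average the per-model class-probability vectors."""
    return (p_resnet + p_vgg) / 2
```

For instance, a 2048-dim ResNet50 vector and a 512-dim VGG16 vector fuse early into a single 2560-dim input, whereas late fusion never mixes features, only probabilities.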

24 pages, 16509 KB  
Article
Lithology Identification via MSC-Transformer Network with Time-Frequency Feature Fusion
by Shiyi Xu, Sheng Wang, Jun Bai, Kun Lai, Jie Zhang, Qingfeng Wang and Jie Zhang
Appl. Sci. 2026, 16(4), 1949; https://doi.org/10.3390/app16041949 - 15 Feb 2026
Abstract
Real-time lithology identification during drilling faces challenges such as indistinct boundaries and difficulties in feature extraction. To address these, this study proposes the MSC-Transformer, a novel model integrating time-frequency features with a deep neural network. A series of drilling experiments were conducted using an intelligent drilling platform, during which triaxial vibration signals were collected from five types of rock specimens: anthracite, granite, bituminous coal, sandstone, and shale. The Short-Time Fourier Transform (STFT) was applied to generate multi-channel power spectral density (PSD) maps, which were then fused into a three-channel tensor to preserve directional frequency information and used as inputs to the model. The proposed MSC-Transformer combines a multi-scale convolutional (MSC) module with a lightweight Transformer encoder to jointly capture local texture patterns and global dependency features, thereby enabling accurate classification of complex lithologies. Experimental results demonstrate that the model achieves an average accuracy of 98.21 ± 0.49% on the test set, outperforming convolutional neural networks (CNNs), the visual geometry group network (VGG), the residual network (ResNet), and bidirectional long short-term memory (Bi-LSTM) by 5.93 ± 0.90%, 2.54 ± 1.11%, 6.38 ± 2.63%, and 10.56 ± 3.11%, respectively, with statistically significant improvements (p < 0.05). Ablation studies and visualization analyses further validate the effectiveness and interpretability of the model architecture. These findings indicate that lithology recognition based on time-frequency representations of vibration signals is both stable and generalizable, offering technical support for real-time intelligent lithology identification during drilling operations.
(This article belongs to the Section Computing and Artificial Intelligence)
20 pages, 13497 KB  
Article
Road Slippery State-Aware Adaptive Collision Warning Method for IVs
by Ying Cheng, Yu Zhang, Mingjiang Cai and Wei Luo
Electronics 2026, 15(4), 829; https://doi.org/10.3390/electronics15040829 - 14 Feb 2026
Abstract
To address critical limitations of conventional forward collision warning (FCW) systems, including inadequate road condition detection accuracy, significant warning area prediction errors, and poor environmental adaptability on wet or snow-covered roads, this study develops an adaptive collision warning framework based on real-time road slippery-state recognition. An enhanced ED-ResNet50 model is proposed, incorporating grouped convolutions within the backbone network and embedding ECA attention mechanisms after the second and third residual blocks alongside DDS-DA modules after the fourth block, significantly improving discriminative capability for pavement texture analysis under adverse conditions. This vision-based recognition system synchronizes with YOLOv8 for preceding vehicle detection, enabling the construction of a friction-sensitive safety distance and time-to-collision model that dynamically calibrates warning thresholds according to instantaneous vehicle velocity and road adhesion coefficients. Real-vehicle validation demonstrates an 8.76% improvement in overall warning accuracy and a 7.29% reduction in lateral and early false alarm rates compared to static-threshold systems, confirming practical efficacy for safety assurance in inclement weather.
(This article belongs to the Special Issue Signal Processing and AI Applications for Vehicles, 2nd Edition)
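A friction-sensitive safety distance of the kind this framework calibrates follows from standard kinematics: reaction distance v·t_react plus braking distance v²/(2μg), where μ is the road adhesion coefficient. A minimal sketch of this textbook model (the paper's calibrated thresholds may differ):

```python
G = 9.81  # gravitational acceleration, m/s^2

def stopping_distance(v: float, mu: float, t_react: float = 1.0) -> float:
    """Reaction distance plus braking distance v^2 / (2 * mu * g).
    A lower adhesion coefficient mu (wet/snowy road) lengthens the distance."""
    return v * t_react + v * v / (2 * mu * G)

def time_to_collision(gap: float, closing_speed: float) -> float:
    """Seconds until contact at constant closing speed; inf if not closing."""
    return float('inf') if closing_speed <= 0 else gap / closing_speed
```

At 20 m/s, dropping μ from 0.8 (dry) to 0.3 (snow) roughly doubles the required stopping distance, which is why a static warning threshold fails in inclement weather.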

16 pages, 3267 KB  
Article
Machine Learning-Based Ear Thermal Imaging for Emotion Sensing
by Budu Tang and Wataru Sato
Sensors 2026, 26(4), 1248; https://doi.org/10.3390/s26041248 - 14 Feb 2026
Abstract
Thermal imaging, which is contact-free, light-independent, and effective in detecting skin temperature changes that reflect autonomic nervous system activity, is expected to be useful for emotion sensing. A recent thermography study demonstrated a linear relationship between ear temperatures and emotional arousal ratings. However, whether and how ear thermal changes may be nonlinearly related to subjective emotions remains untested. To address this issue, we reanalyzed a dataset that included ear thermal images and self-reported arousal ratings obtained while participants watched emotion-eliciting films. We employed linear regression and two nonlinear machine learning models: a random forest model and a ResNet-50 convolutional neural network. Model evaluation using mean squared error and correlation coefficients between actual arousal ratings and model predictions indicated that both machine learning models outperformed linear regression and that the ResNet-50 model outperformed the random forest model. Interpretation of the ResNet-50 model using Gradient-weighted Class Activation Mapping and Shapley additive explanation methods revealed nonlinear associations between temperature changes in specific ear regions and subjective arousal ratings. These findings imply that ear thermal imaging combined with machine learning, particularly deep learning, holds promise for emotion sensing.
(This article belongs to the Special Issue Emotion Recognition Based on Sensors (3rd Edition))

40 pages, 10956 KB  
Article
Automatic Childhood Pneumonia Diagnosis Based on Multi-Model Feature Fusion Using Chi-Square Feature Selection
by Amira Ouerhani, Tareq Hadidi, Hanene Sahli and Halima Mahjoubi
J. Imaging 2026, 12(2), 81; https://doi.org/10.3390/jimaging12020081 - 14 Feb 2026
Abstract
Pneumonia is one of the main causes of child mortality, with chest radiography (CXR) being essential for its diagnosis. However, the low radiation exposure used in pediatric imaging complicates the accurate detection of pneumonia, making traditional examination ineffective. Progress in medical imaging with convolutional neural networks (CNNs) has considerably improved performance, gaining widespread recognition for its effectiveness. This paper proposes an accurate pneumonia detection method based on different deep CNN architectures combined with optimal feature fusion. Enhanced VGG-19, ResNet-50, and MobileNet-V2 models are trained on the most widely used pneumonia dataset, applying appropriate transfer learning and fine-tuning strategies. To create an effective feature input, the Chi-Square technique removes irrelevant features from every enhanced CNN. The resulting subsets are subsequently fused horizontally to generate a more diverse and robust feature representation for binary classification. By combining the 1000 best features from the VGG-19 and MobileNet-V2 models, the suggested approach records the best accuracy (97.59%), recall (98.33%), and F1-score (98.19%) on the test set using a supervised support vector machine (SVM) classifier. The achieved results demonstrate that our approach provides a significant enhancement in performance compared to previous studies using various ensemble fusion techniques while ensuring computational efficiency. We expect this fused-feature system to significantly aid timely detection of childhood pneumonia, especially within constrained healthcare systems.
(This article belongs to the Section Medical Imaging)
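Chi-square feature selection scores each non-negative feature against the class labels and keeps the highest-scoring ones before horizontal fusion. A sketch using the same contingency construction as scikit-learn's `chi2` (not the paper's pipeline; `fuse_top_k` and its parameters are illustrative):

```python
import numpy as np

def chi2_scores(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Chi-square statistic of each non-negative feature against the labels."""
    classes = np.unique(y)
    observed = np.stack([X[y == c].sum(axis=0) for c in classes])   # (C, F)
    priors = np.array([(y == c).mean() for c in classes])[:, None]  # (C, 1)
    expected = priors * X.sum(axis=0)[None, :]                      # (C, F)
    return ((observed - expected) ** 2 / expected).sum(axis=0)

def fuse_top_k(Xa: np.ndarray, Xb: np.ndarray, y: np.ndarray, k: int) -> np.ndarray:
    """Keep the top-k chi-square features of each block, then fuse horizontally."""
    ia = np.argsort(chi2_scores(Xa, y))[-k:]
    ib = np.argsort(chi2_scores(Xb, y))[-k:]
    return np.hstack([Xa[:, ia], Xb[:, ib]])
```

A feature that differs sharply between classes gets a large score, while one with identical class-wise sums scores zero and is dropped.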
28 pages, 14898 KB  
Article
Deep Learning for Classification of Internal Defects in Fused Filament Fabrication Using Optical Coherence Tomography
by Valentin Lang, Qichen Zhu, Malgorzata Kopycinska-Müller and Steffen Ihlenfeldt
Appl. Syst. Innov. 2026, 9(2), 42; https://doi.org/10.3390/asi9020042 - 14 Feb 2026
Abstract
Additive manufacturing is increasingly adopted for the industrial production of small series of functional components, particularly in thermoplastic strand extrusion processes such as Fused Filament Fabrication. This transition relies on technological advances addressing key process limitations, including dimensional instability, weak interlayer bonding, extrusion defects, moisture sensitivity, and insufficient melting. Process monitoring therefore focuses on early defect detection to minimize failed builds and costs, while ultimately enabling process optimization and adaptive control to mitigate defects during fabrication. For this purpose, a data processing pipeline for monitoring Optical Coherence Tomography images acquired during Fused Filament Fabrication is introduced. Convolutional neural networks are used for the automatic classification of tomographic cross-sections. A dataset of tomographic images undergoes semi-automatic labeling, preprocessing, model training, and evaluation. A sliding window detects outlier regions in the tomographic cross-sections, while masks suppress peripheral noise, enabling label generation based on outlier ratios. Data are split into training, validation, and test sets using block-based partitioning to limit leakage. The classification model employs a ResNet-V2 architecture with BottleneckV2 modules. Hyperparameters are optimized, with N = 2, K = 2, dropout of 0.5, and a learning rate of 0.001 yielding the best performance. The model achieves 0.9446 accuracy and outperforms EfficientNet-B0 and VGG16 in both accuracy and efficiency.
(This article belongs to the Special Issue AI-Driven Decision Support for Systemic Innovation)
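The sliding-window labeling step described above can be illustrated on a 1-D profile: flag windows whose mean deviates strongly from the global statistics, then derive a label from the fraction of flagged windows. A simplified stand-in for the paper's OCT detector, with hypothetical thresholds:

```python
import numpy as np

def outlier_ratio(profile: np.ndarray, win: int, z_thresh: float = 2.0) -> float:
    """Fraction of sliding windows whose mean deviates from the global mean
    by more than z_thresh global standard deviations."""
    mu, sigma = profile.mean(), profile.std()
    n = len(profile) - win + 1
    flags = [abs(profile[i:i + win].mean() - mu) > z_thresh * sigma
             for i in range(n)]
    return sum(flags) / n

def label_from_ratio(ratio: float, cutoff: float = 0.1) -> int:
    """Binary defect label from the outlier ratio (cutoff is a placeholder)."""
    return int(ratio > cutoff)
```

A flat profile yields ratio 0 (no defect), while a localized spike flags every window that covers it.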

13 pages, 2154 KB  
Article
A Deep Learning Approach for Classifying Benign, Malignant, and Borderline Ovarian Tumors Using Convolutional Neural Networks and Generative Adversarial Networks
by Maria Giourga, Ioannis Petropoulos, Sofoklis Stavros, Anastasios Potiris, Kallirroi Goula, Efthalia Moustakli, Anthi-Maria Papahliou, Maria-Anastasia Daskalaki, Margarita Segou, Alexandros Rodolakis, George Daskalakis and Ekaterini Domali
Med. Sci. 2026, 14(1), 89; https://doi.org/10.3390/medsci14010089 - 14 Feb 2026
Abstract
Background/Objectives: Accurate preoperative characterization of ovarian masses is essential for appropriate clinical management, particularly for borderline ovarian tumors (BOTs), which are less common and often difficult to distinguish from benign or malignant lesions on ultrasound. Although expert subjective ultrasound assessment achieves high diagnostic accuracy, the limited availability of highly trained sonologists restricts its widespread application. Artificial intelligence-based approaches offer a potential solution; however, the low prevalence of BOTs restricts the development of robust deep learning models due to severe class imbalance. This study aimed to develop a Convolutional Neural Network (CNN)-based classifier enhanced with Generative Adversarial Networks (GANs) to improve the discrimination of ovarian masses as benign, malignant, or BOT using ultrasound images. Methods: A total of 3816 ultrasound images from 636 ovarian masses were retrospectively analyzed, including 390 benign lesions, 202 malignant tumors, and 44 BOTs. To address class imbalance, a Deep Convolutional GAN (DCGAN) was used to generate 2000 synthetic BOT images for data augmentation. A three-class ensemble CNN model integrating the VGG16, ResNet50, and InceptionNetV3 architectures was developed. Performance was assessed on an independent test set and compared with a baseline model trained without DCGAN augmentation. Results: The incorporation of DCGAN-generated BOT images significantly enhanced classification performance. The BOT F1-score increased from 68.4% to 86.5%, while overall accuracy improved from 84.7% to 91.5%. For BOT identification, the final model achieved a sensitivity of 88.2% and a specificity of 85.1%. Class-specific AUCs were 0.96 for benign lesions, 0.94 for malignant tumors, and 0.91 for BOTs. Conclusions: DCGAN-based augmentation effectively expands limited ultrasound datasets and improves CNN performance, particularly for BOT detection. This approach demonstrates potential as a decision support tool for preoperative assessment of ovarian masses.
(This article belongs to the Section Gynecology)

23 pages, 2666 KB  
Article
A Study on ACCC Surface Defect Classification Method Using ResNet18 with Integrated SE Attention Mechanism
by Wenlong Xiao and Rui Chen
Appl. Sci. 2026, 16(4), 1899; https://doi.org/10.3390/app16041899 - 13 Feb 2026
Abstract
Surface defect detection in aluminum-based composite core conductors (ACCC) via X-ray imaging has long been constrained by challenges such as small sample sizes, class imbalance, model redundancy, and inadequate adaptation to single-channel industrial images. To address these challenges, this paper proposes SE-ResNet18, a lightweight classification model synergistically designed for industrial single-channel X-ray images. The model features a co-adapted architecture in which a single-channel input layer (preserving native image information and eliminating RGB conversion overhead) is coupled with a channel attention mechanism (to amplify subtle defect features), all within a globally optimized lightweight framework. With targeted data augmentation and robust training strategies, the model achieves superior performance on the ACCC defect dataset: classification accuracy reaches 98.39%, while excelling in lightweight design (12.0 million parameters) and real-time capability (0.44 ms/image inference speed). The experiments demonstrate that the proposed model exhibits high classification accuracy in testing while offering superior lightweight characteristics and inference efficiency. This provides a feasible solution for achieving high-precision detection and real-time processing in industrial scenarios, showing potential for ACCC online detection applications.
(This article belongs to the Special Issue AI Applications in Modern Industrial Systems)
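The channel attention mechanism in SE-ResNet18 is the Squeeze-and-Excitation block: globally average-pool each channel, pass the result through a two-layer bottleneck with ReLU and sigmoid, and rescale the channels by the resulting gates. A minimal NumPy sketch (weights `w1`, `w2` are placeholders, not trained parameters):

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-Excitation on a (C, H, W) feature map.

    Squeeze: global average pool per channel. Excitation: reduction FC
    (w1: C/r x C) + ReLU, expansion FC (w2: C x C/r) + sigmoid.
    Scale: reweight each channel by its gate in (0, 1).
    """
    z = x.mean(axis=(1, 2))                  # squeeze -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0))  # excitation -> (C,) gates
    return x * s[:, None, None]              # scale channels
```

With zero weights every gate is sigmoid(0) = 0.5, so the block halves every channel; trained weights instead learn which channels carry defect-relevant texture.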

21 pages, 7276 KB  
Article
SkySeg-Net: Sky Segmentation-Based Row-Terminal Recognition in Trellised Orchards
by Haiyang Gu, Yong Wang, Huaiyang Liu, Tong Tian, Changxing Geng and Yun Shi
Mach. Learn. Knowl. Extr. 2026, 8(2), 46; https://doi.org/10.3390/make8020046 - 13 Feb 2026
Abstract
Perception in trellised orchards is often challenged by dense canopy occlusion and overhead plastic coverings, which cause pronounced variations in sky visibility at row terminals. Accurately recognizing row terminals, including both row head and row tail positions, is therefore essential for understanding orchard row structures. This study presents SkySeg-Net, a sky segmentation-based framework for row-terminal recognition in trellised orchards. SkySeg-Net is built on an enhanced multi-scale U-Net architecture and employs ResNeSt residual split-attention blocks as the backbone. To improve feature discrimination under complex illumination and occlusion conditions, the Convolutional Block Attention Module (CBAM) is integrated into the downsampling path, while a Pyramid Pooling Module (PPM) is introduced during upsampling to strengthen multi-scale contextual representation. Sky regions are segmented from both front-view and rear-view camera images, and a hierarchical threshold-based pixel-sum analysis is applied to infer row-terminal locations based on sky-region distribution patterns. To support a comprehensive evaluation, a dedicated trellised vineyard dataset was constructed, featuring front-view and rear-view images and covering three representative grapevine growth stages (BBCH 69–71, 73–77, and 79–89). Experimental results show that SkySeg-Net achieves an mIoU of 91.21% and an mPA of 94.82% for sky segmentation, with a row-terminal recognition accuracy exceeding 98.17% across all growth stages. These results demonstrate that SkySeg-Net provides a robust and reliable visual perception approach for row-terminal recognition in trellised orchard environments.
(This article belongs to the Section Data)

19 pages, 15602 KB  
Article
DK-EffiPointMLP: An Efficient 3D Dorsal Point Cloud Network for Individual Identification of Pigs
by Yuhang Li, Nan Yang, Juan Liu, Yongshuai Yang, Shuai Zhang, Jiaxin Feng, Jie Hu and Fuzhong Li
Animals 2026, 16(4), 590; https://doi.org/10.3390/ani16040590 - 13 Feb 2026
Abstract
Accurate non-contact individual identification of pigs is crucial for their intelligent and efficient management. However, traditional recognition technologies generally suffer from weak local feature expression, feature redundancy, and insufficient channel importance modeling. To address these challenges, this study proposes a novel network model, DK-EffiPointMLP, for individual identification based on 3D dorsal point clouds. The model integrates a Dual-branch Local Feature enhancement module (DLF) and an Efficient Partial Convolution-Residual Refinement module (EffiConv). Specifically, the DLF module adopts a dual-branch structure of KNN and dilated KNN to expand the receptive field, while the EffiConv module combines 1D convolution with the SE mechanism to strengthen key channel modeling. To evaluate the model, a dataset of 10 individual pigs with 8411 samples was constructed. Experimental results show that DK-EffiPointMLP achieves accuracies of 96.86% on this self-built dataset and 95.2% on ModelNet40. When all baseline models were re-trained under the same pipeline and preprocessing protocols, our model outperformed existing mainstream models by 2.74 and 1.1 percentage points, respectively. This approach provides an efficient solution for automated management in commercial farming.
(This article belongs to the Section Pigs)
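The dilated-KNN idea used by the DLF module widens a point's receptive field without increasing the neighbor count: instead of the k nearest points, take every d-th point among the k·d nearest. A simplified NumPy sketch (brute-force distances, single query point; the paper's module is more elaborate):

```python
import numpy as np

def knn_indices(points: np.ndarray, query: int, k: int, dilation: int = 1) -> np.ndarray:
    """Indices of k neighbors of points[query] in an (N, 3) point cloud.

    With dilation d > 1, sample every d-th of the k*d nearest neighbors,
    so the same k points cover a wider spatial extent.
    """
    d2 = ((points - points[query]) ** 2).sum(axis=1)  # squared distances
    order = np.argsort(d2)[1:]                        # nearest first, skip self
    return order[:k * dilation:dilation]
```

On six collinear points, the two ordinary nearest neighbors of point 0 are points 1 and 2, while dilation 2 instead selects points 1 and 3, reaching farther at the same k.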

23 pages, 16195 KB  
Article
Integrating ShuffleNetV2 with Multi-Scale Feature Extraction and Coordinate Attention Combined with Knowledge Distillation for Apple Leaf Disease Recognition
by Wei-Chia Lo and Chih-Chin Lai
Algorithms 2026, 19(2), 151; https://doi.org/10.3390/a19020151 - 13 Feb 2026
Abstract
Misdiagnosing plant diseases often leads to a range of negative consequences, including the overuse of pesticides and unnecessary food waste. Traditionally, identifying diseases on plant leaves has relied on manual visual inspection, making it a complex and time-consuming task. Since the advent of convolutional neural networks, however, recognition performance for leaf diseases has improved significantly. Most contemporary studies that apply AI techniques to plant-leaf disease classification focus primarily on boosting accuracy, frequently overlooking the limitations posed by resource-constrained real-world environments. To address these challenges, this study employs knowledge distillation to enable small models to approximate the recognition capabilities of larger ones. We enhance a ShuffleNetV2-based model by integrating multi-scale feature extraction and a coordinate-attention mechanism, and we further improve the lightweight student model through knowledge distillation to boost its recognition performance. Experimental results show that the proposed model achieves 93.15% accuracy on the Plant Pathology 2021-FGVC8 dataset, utilizing only 0.36 M parameters and 0.0931 GFLOPs. Compared to the ResNet50 baseline, our architecture reduces parameters by nearly 98% while limiting the accuracy gap to a mere 1.6%. These results confirm the model's ability to maintain robust performance with minimal computational overhead, providing a practical solution for precision agriculture on resource-limited edge devices.
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))
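The abstract above describes the standard teacher–student knowledge-distillation setup (a large teacher model guiding a small ShuffleNetV2-based student). The paper's code is not shown here; the following is a minimal NumPy sketch of the classic Hinton-style distillation loss such a pipeline typically uses, with hypothetical temperature `T` and blend weight `alpha` (the paper's actual hyperparameters are not given in the abstract):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend a soft-target KL term (teacher -> student) with
    hard-label cross-entropy on the student's own predictions."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 as in Hinton et al. (2015)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1) * T * T
    # ordinary cross-entropy against the ground-truth labels
    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels] + 1e-12)
    return alpha * kl.mean() + (1 - alpha) * ce.mean()
```

When the student's logits match the teacher's, the KL term vanishes and only the hard-label term remains, which is why the loss rewards the student for mimicking the teacher's full output distribution rather than just its top prediction.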
17 pages, 1091 KB  
Article
ASD Recognition Through Weighted Integration of Landmark-Based Handcrafted and Pixel-Based Deep Learning Features
by Asahi Sekine, Abu Saleh Musa Miah, Koki Hirooka, Najmul Hassan, Md. Al Mehedi Hasan, Yuichi Okuyama, Yoichi Tomioka and Jungpil Shin
Computers 2026, 15(2), 124; https://doi.org/10.3390/computers15020124 - 13 Feb 2026
Abstract
Autism Spectrum Disorder (ASD) is a neurological condition that affects communication and social interaction skills, with individuals experiencing a range of challenges that often require specialized care. Automated systems for recognizing ASD face significant challenges due to the complexity of identifying distinguishing features from facial images. This study proposes an incremental advancement in ASD recognition by introducing a dual-stream model that combines handcrafted facial-landmark features with deep learning-based pixel-level features. The model processes images through two distinct streams to capture complementary aspects of facial information. In the first stream, facial landmarks are extracted using MediaPipe (v0.10.21), with a focus on 137 symmetric landmarks. The face’s position is adjusted using in-plane rotation based on eye-corner angles, and geometric features along with 52 blendshape features are processed through Dense layers. In the second stream, RGB image features are extracted using pre-trained CNNs (e.g., ResNet50V2, DenseNet121, InceptionV3) enhanced with Squeeze-and-Excitation (SE) blocks, followed by feature refinement through Global Average Pooling (GAP) and Dense layers. The outputs from both streams are fused using weighted concatenation through a softmax gate, followed by further feature refinement for classification. This hybrid approach significantly improves the ability to distinguish between ASD and non-ASD faces, demonstrating the benefits of combining geometric and pixel-based features. The model achieved an accuracy of 96.43% on the Kaggle dataset and 97.83% on the YTUIA dataset. Statistical hypothesis testing further confirms that the proposed approach provides a statistically meaningful advantage over strong baselines, particularly in terms of classification correctness and robustness across datasets. While these results are promising, they show incremental improvements over existing methods, and future work will focus on optimizing performance to exceed current benchmarks. Full article
(This article belongs to the Special Issue Machine and Deep Learning in the Health Domain (3rd Edition))
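The fusion step described above, where "outputs from both streams are fused using weighted concatenation through a softmax gate", can be sketched as follows. This is a minimal NumPy illustration under assumed shapes, not the paper's implementation: the gate logits are shown as a plain input, whereas in the actual model they would be learned:

```python
import numpy as np

def softmax_gate_fusion(f_landmark, f_pixel, gate_logits):
    """Scale two feature streams by softmax-normalized gate weights,
    then concatenate them into a single fused feature vector."""
    w = np.exp(gate_logits - gate_logits.max())  # stable softmax over 2 logits
    w = w / w.sum()
    return np.concatenate([w[0] * f_landmark, w[1] * f_pixel], axis=-1)
```

With equal gate logits each stream contributes with weight 0.5; as one logit grows, its stream dominates the fused vector, which lets the classifier learn how much to trust geometric versus pixel-based evidence per deployment.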