Search Results (220)

Search Parameters:
Keywords = visual geometry group

23 pages, 1814 KB  
Article
Quantifying Typological Repetition, Compactness, and Domestic Spatial Quality in Finnish Apartment Layouts: Evidence from Aurinkolahti, Helsinki
by Vera Bijelić and Yasmany García-Ramírez
Architecture 2026, 6(3), 106; https://doi.org/10.3390/architecture6030106 (registering DOI) - 5 Jul 2026
Abstract
Apartment floor plans have traditionally been interpreted through qualitative typological readings, but coded floor-plan datasets now enable reproducible quantitative analysis. This article examines architectural metadata from apartment units in Aurinkolahti, Helsinki, Finland, to explore typological repetition, spatial compactness, and selected indicators of domestic spatial quality. Rather than reconstructing full apartment geometries, the study treats each mapped layout as an architectural metadata record and distinguishes it from the number of built apartments represented by that layout. This stock-sensitive distinction allows repeated floor plans to be analyzed as components of housing production, not merely as individual cases. The workflow includes data cleaning, classification-coverage assessment, weighted and unweighted summaries, temporal grouping, compactness metrics, repetition indicators, kitchen-related quality markers, exploratory clustering, and statistical modeling. Results show typological concentration around a limited number of layout families and identify the 2021–2023 period as combining smaller weighted mean floor areas, higher repetition intensity, and lower kitchen natural-light shares. Coded floor-plan metadata are therefore positioned as a limited but useful intermediate form of architectural evidence: more structured than visual typological description, but not a substitute for direct plan analysis or comprehensive housing-quality assessment. Full article
(This article belongs to the Special Issue Architecture in the Digital Age)

16 pages, 7606 KB  
Article
Image Processing and Deep Convolutional Neural Network Method for Automated Malaria Parasite Detection in Thin Blood Slide Images
by Kavita Kumari, Taruna Kaura, Abhishek Mewara, Suman Tewary and Neerja Mittal Garg
Diagnostics 2026, 16(13), 2091; https://doi.org/10.3390/diagnostics16132091 - 3 Jul 2026
Abstract
Background: Malaria is a life-threatening disease caused by Plasmodium species, which is endemic in tropical and subtropical regions worldwide. In clinical settings, experienced parasitologists perform microscopic examinations of thick/thin blood slides. This method is labour-intensive and is adversely affected by inter- and intra-observer variability among the microscopists. The present study aimed to develop a malaria screening algorithm using computer vision to identify and classify malaria parasite-infected red blood cells (RBCs) from microscopic blood slide images. Methods: The proposed classification methodology first employs a digital image processing technique, the watershed transform, to preprocess the raw images, followed by connected component labelling to accurately segment and isolate individual RBCs from the background. To classify these segmented cells as either normal or infected, convolutional neural networks (CNNs) were utilized, leveraging their ability to automatically extract relevant features through deep, hidden layers, thus eliminating the need for manual feature engineering. Results: To compare and determine the most effective classification engine, the study developed and evaluated five distinct models: four well-established transfer learning architectures (VGG16, VGG19, DenseNet121, and InceptionV3), alongside a newly proposed custom CNN model. A total of 2422 segmented RBC images were used for training, and 692 different images were used for testing, with the VGG19 model showing the best accuracy at 99.57%. The proposed CNN architecture also showed competitive results with 99.14% accuracy. Conclusions: Transfer learning models demonstrated remarkable accuracy for malaria parasite classification from blood smear slides, with VGG19 (99.57%) achieving the highest accuracy on diverged datasets for the test images.
The analysis demonstrates the potential of this approach as a computational aid for future image-based malaria screening in conjunction with existing diagnostic tests. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

23 pages, 31441 KB  
Article
Identification of Acoustic Emission Spectrograms from Limestone Fracturing Based on a Novel Deep Learning Model
by Yan Zhang, Daojing Guo, Yulong Ye, Lantao Huang, Cong Fan, Jiancheng Huang and Mingdong Wei
Sensors 2026, 26(13), 4157; https://doi.org/10.3390/s26134157 - 1 Jul 2026
Viewed by 234
Abstract
The progressive development of microscopic fractures within rock masses is a primary mechanism of macroscopic failure, threatening the structural integrity of rock engineering systems. In this paper, a novel deep learning model, Principal Component Analysis (PCA)-Visual Geometry Group 16 (VGG16), is developed to accurately identify spectrogram features associated with limestone fractures. In this architecture, a PCA-based convolution encoder is seamlessly integrated as a foundational preprocessing layer before feedforwarding into the deep neural network to execute linear feature purification. The model is first validated on standard image datasets comprising handwritten digits and facial images to evaluate classification performance. Subsequently, acoustic emission signals are acquired during triaxial compression tests on limestone specimens pretreated with cyclic acid–alkali exposure. The PCA-VGG16 framework is then employed to classify the corresponding acoustic spectrograms, and its performance is quantitatively compared with a conventional convolutional neural network (CNN) and the standard VGG16 model. The results indicate that the PCA-VGG16 model achieves classification accuracies that are 19.19% and 10.77% higher than the conventional CNN and standard VGG16 models, respectively. In terms of computational efficiency, the training time is reduced by 35.00% and 23.53% compared to CNN and VGG16. The superior classification performance of the proposed PCA-VGG16 model enables accurate identification of internal microscopic fracture characteristics in limestone. Furthermore, the integration of acoustic emission signals with deep learning models offers an effective approach for quantifying internal fracture levels and predicting the progressive failure of rocks. Full article

37 pages, 2414 KB  
Article
Spatially Aware Pair Proposal for Panoptic Scene Graph Generation
by Hanzhu Dai, Qiang Zhang, Binghao Wang and Mai Liu
Sensors 2026, 26(13), 4119; https://doi.org/10.3390/s26134119 - 30 Jun 2026
Viewed by 189
Abstract
Images captured by vision sensors provide visual evidence for scene understanding, including object appearances, pixel-level regions, and spatial relations among entities. Panoptic Scene Graph Generation (PSG) constructs structured scene representations by grounding visual entities with panoptic masks and predicting relationships among objects and regions. In pair-then-relation PSG pipelines, subject–object pair recall is critical to final triplet recall. However, existing pair proposal approaches mainly score candidate subject–object pairs based on object–query feature matching, while mask-derived spatial cues such as object locations, relative geometry, and local layouts remain underexplored. Consequently, ground-truth subject–object pairs may be excluded from the Top-Kr proposals before relation decoding. To address this problem, this paper proposes a Spatially Aware Pair Proposal Model (SAPPM), which incorporates mask-derived soft centroids, relative geometry, and local-neighborhood context into pair scoring. SAPPM uses Grouped Vector Attention (GVA) to model local spatial interactions and introduces a spatially adaptive gating module to calibrate spatial-branch contributions. Experiments on the PSG dataset under the Scene Graph Detection (SGDet) protocol show that SAPPM achieves competitive performance, reaching 32.53 R@20 and 27.36 mR@20. These results indicate that SAPPM improves PSG performance by enhancing ground-truth pair coverage in the candidate proposal set. Full article

16 pages, 4123 KB  
Article
Goniochromism of Multicolor and Interference Pigments Under Varying Illumination Conditions
by Mirica Karlovits, Blaž Likozar and Uroš Novak
Appl. Sci. 2026, 16(12), 6103; https://doi.org/10.3390/app16126103 - 16 Jun 2026
Viewed by 159
Abstract
Color results from the interaction of objects with varying wavelengths of light and the human visual system’s perception under different illumination conditions. In this study, special emphasis was placed on examining how varying illumination conditions and measurement geometries affect the color appearance and optical properties of printed effect pigments. Two distinct groups of pigments were examined: three interference pigments (M-series) based on calcium–aluminum borosilicate substrates, and three multicolor pigments (C-series) based on silicon dioxide. To ensure comparability of the results, all pigments were printed using screen printing techniques onto black PVC film. Characterization involved using a multi-angle spectrophotometer to measure CIEL*a*b* values, chroma (C*), and hue (h*) under CIE standard illuminants D50, A2, and F2 at a fixed illumination angle of 45° and aspecular angles of −15°, 15°, 25°, 45°, 75°, and 110°. Furthermore, the research methodology included the evaluation of lightness difference (∆L*), color differences (∆E*), chroma difference (∆C*), and hue difference (∆H*), with the D50 illuminant chosen as the reference and A2 and F2 as sample illuminants. The flop index (FI), as the indicator of lightness change at different scattering angles, was calculated for all printed pigments under all three standard illuminations. This multidisciplinary approach provided a deeper understanding of the relationship between pigment structure, illumination conditions, and viewing angles in our visual perception of printed pigments, which is of great importance for the development and optimization of goniochromatic materials. The results showed that while A2 and F2 illuminants have a negligible impact on lightness differences across all pigments, they induce noticeable variations in color, chroma, and hue differences, particularly at near-specular angles (−15° and 15°). 
Conversely, these differences become negligible at far-aspecular angles (75° and 110°). Furthermore, flop index (FI) analysis revealed that despite the larger borosilicate flakes in the M-series, the silicon dioxide-based C-series pigments exhibited the highest overall flop effect, with pigment C1 maintaining consistently high FI values under all illuminants. Full article
(This article belongs to the Section Chemical and Molecular Sciences)
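
The flop index mentioned in the abstract above can be illustrated with a short sketch. It assumes the commonly cited form FI = 2.69 (L*15 − L*110)^1.11 / (L*45)^0.86, computed from lightness values at the 15°, 45°, and 110° aspecular angles; the abstract does not state which variant the authors used, and the function name and sample values below are hypothetical.

```python
def flop_index(l15: float, l45: float, l110: float) -> float:
    """Flop index from CIE L* measured at aspecular angles 15, 45, and 110 degrees.

    Assumes the commonly cited form:
        FI = 2.69 * (L*15 - L*110)**1.11 / (L*45)**0.86
    The paper listed above may use a different variant.
    """
    if l45 <= 0:
        raise ValueError("L* at 45 degrees must be positive")
    diff = l15 - l110
    if diff < 0:
        # A negative base with a fractional exponent is not a real number,
        # so this sketch rejects that case instead of guessing a convention.
        raise ValueError("expected L* at 15 degrees >= L* at 110 degrees")
    return 2.69 * diff ** 1.11 / l45 ** 0.86


# Hypothetical lightness readings, for illustration only.
sample_fi = flop_index(l15=100.0, l45=50.0, l110=20.0)
print(round(sample_fi, 2))
```

A sample whose lightness does not change between 15° and 110° gets a flop index of zero under this form, and larger near-specular to far-aspecular lightness drops give larger values.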

26 pages, 2191 KB  
Article
Convolutional Neural Networks: Biological Foundations, Hidden Limitations, and Future Directions
by Luis Sacouto and Andreas Wichert
Electronics 2026, 15(12), 2654; https://doi.org/10.3390/electronics15122654 - 15 Jun 2026
Viewed by 299
Abstract
Convolutional neural networks (CNN) have transformed visual recognition, yet robust geometric reasoning, reliable out-of-distribution generalization, and recognition from limited data remain substantially unsolved. CNNs draw their architectural inspiration from the mammalian visual cortex, but the translation from biology to engineering was selective and, in places, imprecise, and those imprecisions have consequences that are well documented. This paper examines where the biological fidelity holds and where it gives way, grounding the analysis in formal results that predate deep learning and in recent empirical findings on CNN failure modes. We identify three diagnosable architectural limitations. First, CNNs conflate visual modalities that the biological system separates structurally at the lateral geniculate nucleus, feeding raw RGB pixels into a single undifferentiated filter bank and entangling orientation, color, and texture signals from the first layer onward. Second, CNNs repeat a spatial subsampling operation across the full depth of the network, far beyond the early visual cortex stages where it has biological warrant. Barnard and Casasent established formally in 1990 that this operation discards positional information irreversibly at every layer where it is applied, and repeating it into regions that correspond to V4 and inferotemporal cortex compounds this loss without the compensating transition to qualitatively different computations that the biological hierarchy performs. Third, the pooling-as-complex-cell analogy that motivated this design reflects a misreading of what complex cells compute. The spatiotemporal energy model formalizes complex cell behavior as geometry extraction: detecting the presence and orientation of a local edge structure robustly, abstracting over photometric accidents of contrast polarity and sub-wavelength phase that are not geometrically meaningful. 
Pooling is a tolerable first-stage approximation of this behavior, but as a general-purpose invariance mechanism repeated across the full depth of the network, it is attempting something categorically different, namely object-level position invariance through spatial subsampling, which achieves its goal by discarding exactly the geometric information that the energy model preserves. Treating pooling as a scalable, indefinitely repeatable implementation of complex cell behavior—rather than as a first-stage approximation with a natural biological endpoint at V3—conflates two operations that differ not in degree but in kind, and crucially it removed the principled criterion for confining the S-C operation to early visual cortex: because pooling was understood as a general-purpose invariance mechanism, the field had no architectural reason to stop repeating it. We survey how capsule networks, group-equivariant CNNs, PDE-based networks, and vision transformers each address one or two of these limitations while leaving the others intact. We propose six desiderata that a more biologically complete architecture would need to satisfy and argue that satisfying them requires treating the visual cortex’s solution as a coherent package in which each component depends on the others working correctly, rather than as a menu of independently selectable principles. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 4th Edition)

35 pages, 5529 KB  
Article
Occasion-Based Clothing Classification Using Vision Transformer and Traditional Machine Learning Models
by Hanaa Alzahrani, Maram Almotairi and Arwa Basbrain
Computers 2026, 15(4), 249; https://doi.org/10.3390/computers15040249 - 17 Apr 2026
Viewed by 967
Abstract
Clothing classification by occasion is an important area in computer vision and artificial intelligence (AI). This task is particularly challenging because of the subtle visual similarities among clothing categories such as formal, party, and casual attire. Variations in color, fabric, patterns, and lighting further increase the complexity of this task. To address this challenge, we used the Fashionpedia dataset to create a balanced subset of 15,000 images. Specifically, we adopted two different methods for labeling these images: automated classification, which relies on category identifications (IDs) and components, and manual labeling performed by human annotators. We then implemented our preprocessing pipeline, which includes several steps: resizing, image normalization, background removal using segmentation masks, and class balancing. We benchmarked traditional models, including artificial neural networks (ANNs), support vector machines (SVMs), and k-nearest neighbors (KNNs), which use a histogram of oriented gradient (HOG) features, as well as deep learning models such as convolutional neural networks (CNNs), the Visual Geometry Group 16 (VGG16) model utilizing transfer learning, and the vision transformer (ViT) model, all evaluated using identical data splits and preprocessing procedures. The traditional models achieved moderate accuracy, ranging from 54% to 66%. In contrast, the ViT model achieved an accuracy of 81.78% with automated classification and 98.09% with manual labeling. This indicates that a higher label accuracy, along with the preprocessing steps used, significantly enhances the performance. Together, these factors improve the effectiveness of ViT in context-aware apparel classification and establish a reliable baseline for future research. Full article
(This article belongs to the Special Issue Machine Learning: Innovation, Implementation, and Impact)

12 pages, 796 KB  
Proceeding Paper
Design of a Lightweight Video-Based Ear Biometric System on Raspberry Pi 5 Using You Only Look Once Version 12 and EfficientNet-4
by Kristian Emmanuel Padilla, Michael Robin Saculsan and John Paul Cruz
Eng. Proc. 2026, 134(1), 50; https://doi.org/10.3390/engproc2026134050 - 14 Apr 2026
Viewed by 738
Abstract
Recent advances in ear biometrics have yielded increasingly accurate detection and recognition methods, driven by the ear’s uniqueness and permanence as a non-invasive biometric modality. Nonetheless, several limitations persist, including computationally demanding models, inconsistent evaluation metrics, and portable systems restricted by manual capture and limited datasets. To address these challenges, we developed a lightweight, video-based ear biometric system implemented on the Raspberry Pi 5. The system integrates You Only Look Once Version 12 (YOLOv12) for ear detection, EfficientNet-4 for feature extraction, and k-Nearest Neighbors (k-NN) for recognition. Its hardware platform combines the Raspberry Pi 5 with the Raspberry Pi AI Camera and AI HAT+. To train, fine-tune, and optimize YOLOv12 and EfficientNet-4, we used the Visual Geometry Group (VGG)Face-Ear dataset for training and the Unconstrained Ear Recognition Challenge 2019 dataset for validation, with k-NN employed for classification. The system was evaluated for classification accuracy and system-level performance. Thirteen participants, comprising 10 enrolled and three unenrolled subjects, took part in testing the system. The enrolled participants were correctly identified, whereas the unenrolled participants were rejected. The system achieved 92.31% accuracy, 95.45% precision, 96.97% recall, and an F1-score of 0.95, confirming the feasibility of deploying advanced ear biometric methods on embedded, resource-constrained devices. Full article

7 pages, 1495 KB  
Proceeding Paper
Defect Identification of Trinitario Cacao Beans Using Residual Network-50 for Quality Control
by Jed Nathan L. Villapando, Kyle Aldrich R. Bordonada and Glenn V. Magwili
Eng. Proc. 2026, 134(1), 53; https://doi.org/10.3390/engproc2026134053 - 13 Apr 2026
Viewed by 368
Abstract
Cacao grading in the Philippines has relied on slow and inconsistent visual inspection. To effectively detect defects in Trinitario cacao beans, we developed a compact, low-cost computer vision system using single-bean images captured with a Raspberry Pi 5 and Camera Module 3 under controlled lighting and distance conditions. The dataset comprises 1565 images, partitioned into training (80%), validation (10%), and testing (10%) sets. Each image was resized to 224 × 224 pixels, normalized with ImageNet statistics, and subjected to light augmentation. A ResNet-50 model was fine-tuned through transfer learning, employing AdamW optimization, warmup–cosine scheduling, label smoothing, exponential moving average, and early stopping, to classify beans into five categories: good, moldy, slaty, germinated, and over-fermented. On the held-out test set, the model achieved a 94.0% accuracy, strong per-class F1 scores, and high one-vs-rest mean average precision. Compared with a Visual Geometry Group-16 approach, which attained a 90.67% accuracy, the developed system improved accuracy by about 3.3 percentage points while remaining inexpensive and easy to deploy. The lightweight system provides reliable and scalable cacao bean screening. Further improvements are anticipated through the expansion of underrepresented classes and refinement of class-specific thresholds. Full article
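
The warmup–cosine scheduling named in the abstract above can be sketched as a plain function of the training step. This is a minimal illustration of the general technique, not the authors' configuration: the linear warmup shape, the step counts, and the learning rates are all assumptions.

```python
import math


def warmup_cosine_lr(step: int, total_steps: int, warmup_steps: int,
                     base_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay.

    All hyperparameters are hypothetical; the abstract does not report them.
    """
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Cosine curve from base_lr (progress = 0) down to min_lr (progress = 1).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Example schedule with assumed values: 100 steps, 10 of them warmup.
schedule = [warmup_cosine_lr(s, 100, 10, base_lr=1e-3) for s in range(101)]
```

In a training loop, the returned value would be written to the optimizer's learning rate before each update step.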

6 pages, 685 KB  
Proceeding Paper
Contactless Footprint Acquisition and Automated Identification Using Convolutional Neural Network
by Angelica A. Claros, Elmo Joaquin D. Estacion and Jocelyn F. Villaverde
Eng. Proc. 2026, 134(1), 30; https://doi.org/10.3390/engproc2026134030 - 3 Apr 2026
Viewed by 570
Abstract
Biometric systems are widely used in security and forensic applications. Conventionally, contact-based footprint scanners require physical contact, which presents significant limitations. These devices raise hygiene concerns and are impractical in field identification conditions, such as forensic investigations or disaster victim identification, where quick and non-invasive methods are essential. To address these challenges, a contactless footprint acquisition and identification system was developed using image processing techniques and a Convolutional Neural Network (CNN) based on the Visual Geometry Group–16 layer architecture. The system employs a Raspberry Pi 4, a Logitech C922 camera, and a ring light to capture footprints without direct surface contact. Captured images are processed with Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve contrast and mean thresholding to generate binary images for clearer feature extraction. System performance was evaluated using a multiclass confusion matrix. The CNN correctly classified 158 of 160 test images, achieving an accuracy of 98.75%. This result demonstrates higher accuracy than earlier studies that used older CNN models, such as Alex Krizhevsky’s Network and LeCun’s Network-5, which performed with fewer subjects and lower accuracy rates. The developed system shows potential for biometric security, forensic investigations, and disaster response, where contactless and reliable identification is required. Future research can expand the dataset with more diverse footprints, test performance under varied conditions, and extend the approach to other contactless biometrics such as palmprints or ears. Full article
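
The accuracy figure in the abstract above follows directly from a multiclass confusion matrix: correct predictions lie on the diagonal. The sketch below shows the arithmetic; the four-class matrix is hypothetical and was chosen only so that 158 of 160 images are correct, since the abstract does not give the real per-class counts.

```python
def accuracy_from_confusion(matrix: list[list[int]]) -> float:
    """Overall accuracy: diagonal sum divided by the total number of samples.

    Rows are true classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical 4-class matrix with 160 test images, 158 on the diagonal.
example_matrix = [
    [40, 0, 0, 0],
    [0, 39, 1, 0],
    [0, 0, 40, 0],
    [1, 0, 0, 39],
]
print(f"{accuracy_from_confusion(example_matrix):.2%}")  # prints 98.75%
```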

13 pages, 1952 KB  
Article
Morphology-Evolving Colorimetric Thin-Film Sensor for Visual Detection of Hypochlorous Acid
by Yasumasa Kanekiyo, Takumi Kato and Emi Sakai
Sensors 2026, 26(7), 2082; https://doi.org/10.3390/s26072082 - 27 Mar 2026
Viewed by 565
Abstract
Hypochlorous acid (HClO) is widely used as a low-cost and effective disinfectant; however, its instability under heat and light necessitates simple and reliable monitoring methods. Herein, we report a morphology-evolving thin-film colorimetric sensor that enables intuitive visual detection of HClO through simultaneous color and pattern transitions. The sensor integrates two polymer films with distinct charge-state response behaviors, patterned in X-shaped and circular geometries on a single substrate. Upon exposure to HClO, chlorine-induced modification of amide and amine groups alters the surface charge states, thereby switching the adsorption preference for anionic and cationic dyes. This mechanism results in a pronounced transformation from a blue X-shaped motif to a red circular pattern, enabling direct visual discrimination between different HClO concentrations. Quantitative analysis of RGB values confirmed semi-quantitative detection in the sub-millimolar to millimolar range. The sensor exhibited a linear response in the range of 0–3 mM (R2 > 0.979) with a limit of detection of 0.103 mM. The sensor further demonstrated practical applicability by tracking photodecomposition of a commercial disinfectant. This work demonstrates pattern-coupled colorimetric sensing as a straightforward, user-friendly approach for HClO monitoring. Full article
(This article belongs to the Section Chemical Sensors)

31 pages, 7355 KB  
Article
Optimized Hybrid Feature Space for High-Efficiency Citrus Disease Diagnosis: A Fusion of Handcrafted Blue-Green-Red Color Moments and Deep Convolutional Descriptors
by Edgar Tello-Leal, Bárbara A. Macías-Hernández, Sarahi Rubio-Tinajero, Jaciel David Hernandez-Resendiz and Ulises Manuel Ramirez-Alcocer
Agriculture 2026, 16(6), 711; https://doi.org/10.3390/agriculture16060711 - 23 Mar 2026
Viewed by 1329
Abstract
Accurate and timely diagnosis of citrus diseases is essential for reducing economic losses in global agriculture. Although deep learning models provide high diagnostic accuracy, their computational demands often hinder deployment on resource-limited edge devices. To overcome this challenge, this study proposes an optimized hybrid framework for phytopathological classification. The methodology combines handcrafted descriptors (Blue-Green-Red “BGR” color statistical moments) with hierarchical spatial abstractions derived from a pre-trained Visual Geometry Group 16-layer (VGG16) deep architecture. An initial high-dimensional feature space was created by concatenating 360 handcrafted statistical descriptors and 12,800 deep textural features. By implementing a Wrapper-Greedy Stepwise selection strategy, this original space was reduced by over 96%. The resulting Elite Model identifies 12 and 18 critical attributes across two independent, transcontinental datasets (Mexico and Pakistan, respectively), effectively capturing both subtle chromatic anomalies and complex structural lesions. Experimental benchmarking confirms that this parsimonious hybrid approach delivers robust classification accuracy ranging from 87.30% to 95.23%, significantly outperforming unimodal architectures. Ultimately, this framework provides a highly efficient, interpretable, and scalable solution for real-time disease monitoring in precision agriculture. Full article

37 pages, 2981 KB  
Article
Signs, Shapes, and Spaces: A CAMIL-Informed Qualitative Study of Metaverse Geometry Learning for Deaf and Hard-of-Hearing Students
by Ai Peng Chong, Kung-Teck Wong, Kong Liang Soon Vestly and Kuppusamy Suresh Kumar
Soc. Sci. 2026, 15(3), 191; https://doi.org/10.3390/socsci15030191 - 16 Mar 2026
Viewed by 1128
Abstract
Deaf and Hard-of-Hearing (DHH) students face persistent barriers in geometry education due to instructional approaches that inadequately support visual communication and embodied learning. This study examined DHH students’ experiences with GeoMETriA, a metaverse-based geometry learning platform integrating sign language instruction, three-dimensional visualization, and avatar-mediated interaction. Guided by the Cognitive Affective Model of Immersive Learning (CAMIL), a multi-phase qualitative design was employed, including pre-workshop interviews with four special education teachers and post-workshop focus group discussions with seven DHH secondary students following a four-session learning workshop. The findings indicate that gamified activities and peer collaboration enhanced interest and sustained engagement, while avatar customization supported embodiment and a sense of presence. Students described progression from initial uncertainty to greater confidence through practice and scaffolded support. However, cognitive and usability challenges emerged, particularly concerning sign language video pacing, navigation complexity, and limited instructional scaffolding. The study contributes theoretically by extending CAMIL-informed interpretations to sign-supported metaverse learning, empirically by documenting how engagement, embodiment, and self-efficacy develop during immersive geometry learning, and practically by offering design implications including adjustable sign language delivery, structured scaffolding, and culturally responsive avatar options. These findings suggest that metaverse-based platforms hold promise for supporting DHH learners when accessibility and learner-centered principles are embedded as foundational design considerations. Full article
(This article belongs to the Special Issue Belt and Road Together Special Education 2025)

29 pages, 7769 KB  
Article
Efficient Deep Learning Models Integrated with a Smart Web Application for Classifying Heart Diseases Based on ECG Signals
by Saeed Mohsen, Ahmed F. Ibrahim, Osama F. Hassan, Norah Alnaim, Noorah Albehaijan and M. Abdel-Aziz
Computers 2026, 15(3), 191; https://doi.org/10.3390/computers15030191 - 16 Mar 2026
Viewed by 1074
Abstract
Recent advancements in the accuracy of deep learning (DL) hold significant promise for improving the classification of heart patients. Nevertheless, continued refinement is essential to achieve even greater levels of precision in DL techniques. This paper proposes three efficient DL models, Swin Transformer (Swin-T), Visual Geometry Group (VGG)-19, and Vision Transformer (ViT), which are implemented to classify different types of heart patients. The three DL models are trained on a balanced dataset comprising 600 electrocardiogram (ECG) samples. This dataset contains three classes: Arrhythmia Patient, Myocardic Patient, and Normal Patient. The DL models are implemented in PyTorch v2.10.0, with the models’ hyperparameters fine-tuned to maximize classification accuracy, and data augmentation techniques are applied to the ECG samples. Additionally, a smart web application is designed for classifying heart patients into the three diagnostic categories. The performance of the three models is assessed with several metrics, including area under the precision-recall (AUPR) curve and normalized confusion matrices (NCMs). The three proposed models achieve high testing accuracy for the classification of heart patients. The testing loss (TL) values for Swin-T, VGG-19, and ViT are 0.0707, 0.4138, and 0.0015, respectively. The ViT also achieves an F1-score, true positive rate (TPR), and AUPR of 100%. Full article
(This article belongs to the Special Issue AI in Bioinformatics)

24 pages, 17028 KB  
Article
Lithology Identification via MSC-Transformer Network with Time-Frequency Feature Fusion
by Shiyi Xu, Sheng Wang, Jun Bai, Kun Lai, Jie Zhang, Qingfeng Wang and Jie Zhang
Appl. Sci. 2026, 16(4), 1949; https://doi.org/10.3390/app16041949 - 15 Feb 2026
Viewed by 542
Abstract
Real-time lithology identification during drilling faces challenges such as indistinct boundaries and difficulties in feature extraction. To address these, this study proposes the MSC-Transformer, a novel model integrating time-frequency features with a deep neural network. A series of drilling experiments were conducted using an intelligent drilling platform, during which triaxial vibration signals were collected from five types of rock specimens: anthracite, granite, bituminous coal, sandstone, and shale. Short-time Fourier Transform (STFT) was applied to generate multi-channel power spectral density (PSD) maps, which were then fused into a three-channel tensor to preserve directional frequency information and used as inputs to the model. The proposed MSC-Transformer combines a multi-scale convolutional (MSC) module with a lightweight Transformer encoder to jointly capture local texture patterns and global dependency features, thereby enabling accurate classification of complex lithologies. Experimental results demonstrate that the model achieves an average accuracy of 98.21 ± 0.49% on the test set, outperforming convolutional neural networks (CNNs), visual geometry group (VGG), residual network (ResNet), and bidirectional long short-term memory (Bi-LSTM) by 5.93 ± 0.90%, 2.54 ± 1.11%, 6.38 ± 2.63%, and 10.56 ± 3.11%, respectively, with statistically significant improvements (p < 0.05). Ablation studies and visualization analyses further validate the effectiveness and interpretability of the model architecture. These findings indicate that lithology recognition based on time-frequency representations of vibration signals is both stable and generalizable, offering technical support for real-time intelligent lithology identification during drilling operations. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
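The preprocessing described in the lithology abstract (an STFT per vibration axis, converted to power spectral density maps and fused into a three-channel tensor) can be sketched roughly as below. This is an illustrative reading only: the Hann window, the window length, the hop size, the sampling rate, the synthetic tones, the one-sided PSD scaling, and the helper name `stft_psd` are all assumptions, and the paper's actual settings and fusion step may differ.

```python
import cmath
import math


def stft_psd(signal, fs, win_len=16, hop=8):
    """One-sided PSD map (frequency bins x frames) from a Hann-windowed STFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    scale = fs * sum(w * w for w in window)  # density normalisation
    n_bins = win_len // 2 + 1
    starts = range(0, len(signal) - win_len + 1, hop)
    psd = [[0.0] * len(starts) for _ in range(n_bins)]
    for t, start in enumerate(starts):
        seg = [signal[start + n] * window[n] for n in range(win_len)]
        for k in range(n_bins):
            # Naive DFT of the windowed frame; fine for a toy-sized example.
            coeff = sum(
                seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                for n in range(win_len)
            )
            power = abs(coeff) ** 2 / scale
            if 0 < k < win_len // 2:
                power *= 2  # fold negative frequencies into the one-sided map
            psd[k][t] = power
    return psd


# Synthetic triaxial signal: one pure tone per axis, at bins 2, 4 and 6 (assumed).
fs = 16.0
n_samples = 64
tone_bins = {"x": 2, "y": 4, "z": 6}
tensor = [
    stft_psd([math.sin(2 * math.pi * k * n / 16) for n in range(n_samples)], fs)
    for k in tone_bins.values()
]  # channels x frequency bins x frames

shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
# Strongest frequency bin in the first frame of each channel.
peak_bins = [max(range(len(ch)), key=lambda k: ch[k][0]) for ch in tensor]
print(shape, peak_bins)
```

Stacking one PSD map per axis keeps the directional information separate, so a convolutional front end can treat the result like a three-channel image. In practice a library routine such as an FFT-based spectrogram would replace the naive DFT used here.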
