Search Results (133)

Search Parameters:
Keywords = multiview classification

19 pages, 1949 KB  
Article
Deep Learning for Building Attribute Classification from Street-View Images for Seismic Exposure Modeling
by Rajesh Kumar, Claudio Rota, Flavio Piccoli and Gianluigi Ciocca
Appl. Sci. 2026, 16(2), 875; https://doi.org/10.3390/app16020875 - 14 Jan 2026
Viewed by 151
Abstract
Exposure models are essential for seismic risk assessment to determine environmental vulnerabilities during earthquakes. However, developing these models at scale is challenging because it relies on manual inspection of buildings, which increases costs and introduces significant delays. Developing fast, consistent, and easy-to-deploy automated methods to support this process has become a priority. In this study, we investigate the use of deep learning to accelerate the classification of architectural and structural attributes from street-view imagery. Using the Alvalade dataset, which contains 4007 buildings annotated with 10 multi-class attributes, we evaluated the performance of multiple architecture types. Our analysis shows that deep learning models can successfully extract key structural features, achieving an average macro accuracy of 57%, and a Precision, Recall, and F1-score of 61%, 57%, and 56%, respectively. We also show that prediction quality is further improved by leveraging multi-view imagery of the target buildings. These results demonstrate that deep learning can be an effective solution to reduce the manual effort required for the development of reliable large-scale exposure models, offering a practical solution toward more efficient seismic risk assessment. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
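
The abstract notes that fusing several street-view images of the same building improves prediction quality, but does not state the fusion rule. As a minimal sketch under that caveat, one common option is to average per-view softmax probabilities; the sizes below are hypothetical.

```python
import torch

def fuse_multiview_predictions(per_view_logits):
    """Average per-view class probabilities for one attribute of one building.

    per_view_logits: list of (num_classes,) logit tensors, one per street-view image.
    Softmax averaging is an assumption; the paper's actual fusion rule is not given here.
    """
    probs = torch.stack([logits.softmax(dim=-1) for logits in per_view_logits])
    return probs.mean(dim=0)

# Hypothetical example: three views of one building, a 5-class attribute
views = [torch.randn(5) for _ in range(3)]
fused = fuse_multiview_predictions(views)
print(fused.argmax().item())
```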

33 pages, 40054 KB  
Article
MVDCNN: A Multi-View Deep Convolutional Network with Feature Fusion for Robust Sonar Image Target Recognition
by Yue Fan, Cheng Peng, Peng Zhang, Zhisheng Zhang, Guoping Zhang and Jinsong Tang
Remote Sens. 2026, 18(1), 76; https://doi.org/10.3390/rs18010076 - 25 Dec 2025
Viewed by 380
Abstract
Automatic Target Recognition (ATR) in single-view sonar imagery is severely hampered by geometric distortions, acoustic shadows, and incomplete target information due to occlusions and the slant-range imaging geometry, which frequently give rise to misclassification and hinder practical underwater detection applications. To address these critical limitations, this paper proposes a Multi-View Deep Convolutional Neural Network (MVDCNN) based on feature-level fusion for robust sonar image target recognition. The MVDCNN adopts a highly modular and extensible architecture consisting of four interconnected modules: an input reshaping module that adapts multi-view images to match the input format of pre-trained backbone networks via dimension merging and channel replication; a shared-weight feature extraction module that leverages Convolutional Neural Network (CNN) or Transformer backbones (e.g., ResNet, Swin Transformer, Vision Transformer) to extract discriminative features from each view, ensuring parameter efficiency and cross-view feature consistency; a feature fusion module that aggregates complementary features (e.g., target texture and shape) across views using max-pooling to retain the most salient characteristics and suppress noisy or occluded view interference; and a lightweight classification module that maps the fused feature representations to target categories. Additionally, to mitigate the data scarcity bottleneck in sonar ATR, we design a multi-view sample augmentation method based on sonar imaging geometric principles: this method systematically combines single-view samples of the same target via the combination formula and screens valid samples within a predefined azimuth range, constructing high-quality multi-view training datasets without relying on complex generative models or massive initial labeled data. Comprehensive evaluations on the Custom Side-Scan Sonar Image Dataset (CSSID) and Nankai Sonar Image Dataset (NKSID) demonstrate the superiority of our framework over single-view baselines. Specifically, the two-view MVDCNN achieves average classification accuracies of 94.72% (CSSID) and 97.24% (NKSID), with relative improvements of 7.93% and 5.05%, respectively; the three-view MVDCNN further boosts the average accuracies to 96.60% and 98.28%. Moreover, MVDCNN substantially elevates the precision and recall of small-sample categories (e.g., Fishing net and Small propeller in NKSID), effectively alleviating the class imbalance challenge. Mechanism validation via t-Distributed Stochastic Neighbor Embedding (t-SNE) feature visualization and prediction confidence distribution analysis confirms that MVDCNN yields more separable feature representations and more confident category predictions, with stronger intra-class compactness and inter-class discrimination in the feature space. The proposed MVDCNN framework provides a robust and interpretable solution for advancing sonar ATR and offers a technical paradigm for multi-view acoustic image understanding in complex underwater environments. Full article
(This article belongs to the Special Issue Underwater Remote Sensing: Status, New Challenges and Opportunities)
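
As a rough illustration of the four-module pipeline described above (a shared-weight backbone applied to each view, max-pooling fusion across views, and a lightweight classification head), here is a minimal sketch. The toy CNN, feature width, and class count are placeholders, not the paper's pre-trained ResNet/Swin/ViT configuration, and the input-reshaping module is omitted.

```python
import torch
import torch.nn as nn

class MultiViewMaxPoolNet(nn.Module):
    """Shared-weight per-view feature extraction followed by max-pool fusion (illustrative)."""

    def __init__(self, feat_dim: int = 128, num_classes: int = 8):
        super().__init__()
        self.backbone = nn.Sequential(                      # toy backbone shared across views
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, feat_dim), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 1, H, W) single-channel sonar images
        b, v = views.shape[:2]
        feats = self.backbone(views.flatten(0, 1)).view(b, v, -1)
        fused, _ = feats.max(dim=1)                         # keep the most salient per-feature view response
        return self.head(fused)

model = MultiViewMaxPoolNet()
logits = model(torch.randn(2, 3, 1, 64, 64))                # 2 targets, 3 views each (hypothetical sizes)
print(logits.shape)                                          # torch.Size([2, 8])
```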

27 pages, 8159 KB  
Article
Less for Better: A View Filter-Driven Graph Representation Fusion Network
by Yue Wang, Xibei Yang, Keyu Liu, Qihang Guo and Xun Wang
Entropy 2026, 28(1), 26; https://doi.org/10.3390/e28010026 - 24 Dec 2025
Viewed by 219
Abstract
Multi-view learning has recently gained considerable attention in graph representation learning as it enables the fusion of complementary information from multiple views to enhance representation quality. However, most existing studies neglect that irrelevant views may introduce noise and negatively affect representation quality. To address the issue, we propose a novel multi-view representation learning framework called a View Filter-driven graph representation fusion network, named ViFi. Following the “less for better” principle, the framework focuses on filtering informative views while discarding irrelevant ones. Specifically, an entropy-based adaptive view filter was designed to dynamically filter the most informative views by evaluating their feature–topology entropy characteristics, aiming to not only reduce irrelevance among views but also enhance their complementarity. In addition, to promote more effective fusion of informative views, we propose an optimized fusion mechanism that leverages the filtered views to identify the optimal integration strategy using a novel information gain function. Through extensive experiments on classification and clustering tasks, ViFi demonstrates clear performance advantages over existing state-of-the-art approaches. Full article
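
The “less for better” filtering step can be pictured as scoring every view and keeping only the top-ranked ones. The sketch below uses plain Shannon entropy of each view's feature-value distribution as the score; this is only a stand-in for the feature–topology entropy measure and information-gain fusion described in the abstract, and the keep-highest-entropy rule is an assumption.

```python
import numpy as np

def view_entropy(features: np.ndarray, bins: int = 32) -> float:
    """Shannon entropy of a view's feature-value histogram (illustrative proxy score)."""
    hist, _ = np.histogram(features, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def filter_views(views, keep: int = 2):
    """Return the indices of the `keep` views with the highest entropy score."""
    scores = [view_entropy(v) for v in views]
    return sorted(np.argsort(scores)[-keep:].tolist())

rng = np.random.default_rng(0)
views = [rng.normal(size=(100, 16)) for _ in range(4)]       # four hypothetical feature views
print(filter_views(views, keep=2))
```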

29 pages, 31164 KB  
Article
Geometric Condition Assessment of Traffic Signs Leveraging Sequential Video-Log Images and Point-Cloud Data
by Yiming Jiang, Yuchun Huang, Shuang Li, Jun Liu and He Yang
Remote Sens. 2025, 17(24), 4061; https://doi.org/10.3390/rs17244061 - 18 Dec 2025
Viewed by 305
Abstract
Traffic signs exposed to long-term outdoor conditions frequently exhibit deformation, inclination, or other forms of physical damage, highlighting the need for timely and reliable anomaly assessment to support road safety management. While point-cloud data provide accurate three-dimensional geometric information, their sparse distribution and lack of appearance cues make traffic sign extraction challenging in complex environments. High-resolution sequential video-log images captured from multiple viewpoints offer complementary advantages by providing rich color and texture information. In this study, we propose an integrated traffic sign detection and assessment framework that combines video-log images and mobile-mapping point clouds to enhance both accuracy and robustness. A dedicated YOLO-SIGN network is developed to perform precise detection and multi-view association of traffic signs across sequential images. Guided by these detections, a frustum-based point-cloud extraction strategy with seed-point density growing is introduced to efficiently isolate traffic sign panels and supporting poles. The extracted structures are then used for geometric parameterization and damage assessment, including inclination, deformation, and rotation. Experiments on 35 simulated scenes and nine real-world road scenarios demonstrate that the proposed method can reliably extract and evaluate traffic sign conditions in diverse environments. Furthermore, the YOLO-SIGN network achieves a localization precision of 91.16% and a classification mAP of 84.64%, outperforming YOLOv10s by 1.7% and 8.7%, respectively, while maintaining a reduced number of parameters. These results confirm the effectiveness and practical value of the proposed framework for large-scale traffic sign monitoring. Full article
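
The frustum-based extraction step can be pictured as projecting the mobile-mapping points into the image and keeping those whose projections fall inside a detected sign's bounding box. The pinhole projection below is an assumption, and the paper's seed-point density-growing refinement is omitted.

```python
import numpy as np

def frustum_points(points: np.ndarray, P: np.ndarray, bbox) -> np.ndarray:
    """Select 3D points whose image projections land inside a 2D detection box.

    points: (N, 3) points in the camera frame; P: (3, 4) projection matrix;
    bbox: (xmin, ymin, xmax, ymax) from the detector. Pinhole model assumed.
    """
    homog = np.hstack([points, np.ones((points.shape[0], 1))])    # (N, 4)
    uvw = homog @ P.T                                             # (N, 3)
    in_front = uvw[:, 2] > 0
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-6, None)
    xmin, ymin, xmax, ymax = bbox
    inside = (uv[:, 0] >= xmin) & (uv[:, 0] <= xmax) & (uv[:, 1] >= ymin) & (uv[:, 1] <= ymax)
    return points[in_front & inside]

pts = np.random.rand(1000, 3) * [4.0, 2.0, 30.0]                  # hypothetical points in the camera frame
P = np.hstack([500 * np.eye(3), np.zeros((3, 1))])                # toy intrinsics, no distortion
print(frustum_points(pts, P, bbox=(100, 50, 400, 300)).shape)
```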

26 pages, 9476 KB  
Article
Iron Ore Image Recognition Through Multi-View Evolutionary Deep Fusion Method
by Di Zhang, Xiaolong Qian, Chenyang Shi, Yuang Zhang, Yining Qian and Shengyue Zhou
Future Internet 2025, 17(12), 553; https://doi.org/10.3390/fi17120553 - 1 Dec 2025
Viewed by 298
Abstract
Iron ore image classification is essential for achieving high production efficiency and classification precision in mineral processing. However, real industrial environments face classification challenges due to small samples, inter-class similarity, and on-site noise. Existing methods are limited by single-view approaches that provide insufficient representation, difficulty in achieving an adaptive balance between performance and complexity through manual or fixed feature selection and fusion, and susceptibility to overfitting with poor robustness under small-sample conditions. To address these issues, this paper proposes the evolutionary deep fusion framework EDF-NSDE. The framework introduces multi-view feature extraction that combines lightweight and classical convolutional neural networks to obtain complementary features. Additionally, an evolutionary fusion strategy is designed that uses NSGA-II and differential evolution for multi-objective search, adaptively balancing accuracy and model complexity while reducing overfitting and enhancing robustness through a generalization penalty and adaptive mutation. Furthermore, to overcome data limitations, we constructed a six-class dataset including hematite, magnetite, ilmenite, limonite, pyrite, and rock based on real production scenarios. The experimental results show that on our self-built dataset, EDF-NSDE achieves 84.86%/88.38% on original/augmented test sets, respectively, comprehensively outperforming other models. On a public seven-class mineral dataset, it achieves 92.51%, validating its generalization capability across different mineral types and imaging conditions. In summary, EDF-NSDE provides an automated feature fusion solution that upgrades the mineral classification process, contributing to the development of intelligent manufacturing technology and the industrial internet ecosystem. Full article
(This article belongs to the Special Issue Algorithms and Models for Next-Generation Vision Systems)

33 pages, 2821 KB  
Article
SwinCAMF-Net: Explainable Cross-Attention Multimodal Swin Network for Mammogram Analysis
by Lakshmi Prasanthi R. S. Narayanam, Thirupathi N. Rao and Deva S. Kumar
Diagnostics 2025, 15(23), 3037; https://doi.org/10.3390/diagnostics15233037 - 28 Nov 2025
Cited by 1 | Viewed by 572
Abstract
Background: Breast cancer is a leading cause of cancer-related mortality among women, and earlier diagnosis significantly improves treatment outcomes. However, traditional mammography-based systems rely on single-modality image analysis and lack integration of volumetric and clinical context, which limits diagnostic robustness. Deep learning models have shown promising results in identification but are typically restricted to 2D feature extraction and lack cross-modal reasoning capability. Objective: This study proposes SwinCAMF-Net, a multimodal cross-attention Swin transformer network designed to improve joint breast lesion classification and segmentation by integrating multi-view mammography, 3D ROI volumes, and clinical metadata. Methods: SwinCAMF-Net employs a Swin transformer encoder for hierarchical visual representation learning from mammographic views, a 3D CNN volume encoder for lesion depth context modelling, and a clinical projection module to embed patient metadata. A novel cross-attentive fusion (CAF) module selectively aligns multimodal features through query–key attention. The fused feature representation branches into a classification head for malignancy prediction and a segmentation decoder for lesion localization. The model is trained and evaluated on CBIS-DDSM and RTM benchmark datasets. Results: SwinCAMF-Net achieved accuracy up to 0.978, an AUC-ROC of 0.998, and an F1-score of 0.944 for classification, while segmentation reached a Dice coefficient of 0.931. Ablation experiments confirm that the CAF module improves performance by up to 6.9%, demonstrating its effectiveness in multimodal fusion. Conclusion: SwinCAMF-Net advances breast cancer analysis by providing complementary multimodal evidence through a cross-attentive fusion, leading to improved diagnostic performance and clinical interpretability. The framework demonstrates strong potential in AI-assisted screening and radiology decision support. Full article
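
The query–key cross-attentive fusion (CAF) idea can be sketched with a standard multi-head attention layer in which mammographic-view tokens query the concatenated volume and clinical embeddings. The single layer, residual connection, and dimensions below are assumptions rather than the published module.

```python
import torch
import torch.nn as nn

class CrossAttentiveFusion(nn.Module):
    """Image tokens (queries) attend to volume + clinical tokens (keys/values)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, vol_tokens, clin_tokens):
        context = torch.cat([vol_tokens, clin_tokens], dim=1)    # (B, Nv + Nc, dim)
        fused, _ = self.attn(img_tokens, context, context)       # query-key-value attention
        return self.norm(img_tokens + fused)                     # residual fusion of image tokens

caf = CrossAttentiveFusion()
out = caf(torch.randn(2, 49, 256), torch.randn(2, 8, 256), torch.randn(2, 1, 256))
print(out.shape)  # torch.Size([2, 49, 256])
```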

25 pages, 3379 KB  
Article
LPGGNet: Learning from Local–Partition–Global Graph Representations for Motor Imagery EEG Recognition
by Nanqing Zhang, Hongcai Jian, Xingchen Li, Guoqian Jiang and Xianlun Tang
Brain Sci. 2025, 15(12), 1257; https://doi.org/10.3390/brainsci15121257 - 23 Nov 2025
Viewed by 532
Abstract
Objectives: Existing motor imagery electroencephalography (MI-EEG) decoding approaches are constrained by their reliance on sole representations of brain connectivity graphs, insufficient utilization of multi-scale information, and lack of adaptability. Methods: To address these constraints, we propose a novel Local–Partition–Global Graph learning Network (LPGGNet). The Local Learning module first constructs functional adjacency matrices using partial directed coherence (PDC), effectively capturing causal dynamic interactions among electrodes. It then employs two layers of temporal convolutions to capture high-level temporal features, followed by Graph Convolutional Networks (GCNs) to capture local topological features. In the Partition Learning module, EEG electrodes are divided into four partitions through a task-driven strategy. For each partition, a novel Gaussian median distance is used to construct adjacency matrices, and Gaussian graph filtering is applied to enhance feature consistency within each partition. After merging the local and partitioned features, the model proceeds to the Global Learning module. In this module, a global adjacency matrix is dynamically computed based on cosine similarity, and residual graph convolutions are then applied to extract highly task-relevant global representations. Finally, two fully connected layers perform the classification. Results: Experiments were conducted on both the BCI Competition IV-2a dataset and a laboratory-recorded dataset, achieving classification accuracies of 82.9% and 87.5%, respectively, which surpass several state-of-the-art models. The contribution of each module was further validated through ablation studies. Conclusions: This study demonstrates the superiority of integrating multi-view brain connectivities with dynamically constructed graph structures for MI-EEG decoding. Moreover, the proposed model offers a novel and efficient solution for EEG signal decoding. Full article

23 pages, 7043 KB  
Article
BiNeXt-SMSMVL: A Structure-Aware Multi-Scale Multi-View Learning Network for Robust Fundus Multi-Disease Classification
by Hongbiao Xie, Mingcheng Wang, Lin An, Yaqi Wang, Ruiquan Ge and Xiaojun Gong
Electronics 2025, 14(23), 4564; https://doi.org/10.3390/electronics14234564 - 21 Nov 2025
Viewed by 459
Abstract
Multiple ocular diseases frequently coexist in fundus images, while image quality is highly susceptible to imaging conditions and patient cooperation, often manifesting as blurring, underexposure, and indistinct lesion regions. These challenges significantly hinder robust multi-disease joint classification. To address this, we propose a novel framework, BiNeXt-SMSMVL (Bilateral ConvNeXt-based Structure-aware Multi-scale Multi-view Learning Network), that integrates structural medical biomarkers with deep semantic image features for robust multi-class fundus disease recognition. Specifically, we first employ automatic segmentation to extract the optic disc/cup and vascular structures, calculating medical biomarkers such as vertical/horizontal cup-to-disc ratio (CDR), vessel density, and fractal dimension as structural priors for classification. Simultaneously, a ConvNeXt-Tiny backbone extracts multi-scale visual features from raw fundus images, enhanced by SENet channel attention mechanisms to improve feature representation. Architecturally, the model performs independent predictions on left-eye, right-eye, and fused binocular images, leveraging multi-view ensembling to enhance decision stability. Structural priors and image features are then fused for joint classification modeling. Experiments on public datasets demonstrate that our model maintains stable performance under variable image quality and significant lesion heterogeneity, outperforming existing multi-label classification methods in key metrics including F1-score and AUC. Also, our approach exhibits strong robustness, interpretability, and clinical applicability. Full article
(This article belongs to the Section Artificial Intelligence)

16 pages, 2254 KB  
Article
Adaptive Multi-View Hypergraph Learning for Cross-Condition Bearing Fault Diagnosis
by Yangyi Li, Kyaw Hlaing Bwar, Rifai Chai, Kwong Ming Tse and Boon Xian Chai
Mach. Learn. Knowl. Extr. 2025, 7(4), 147; https://doi.org/10.3390/make7040147 - 15 Nov 2025
Cited by 1 | Viewed by 609
Abstract
Reliable bearing fault diagnosis across diverse operating conditions remains a fundamental challenge in intelligent maintenance. Traditional data-driven models often struggle to generalize due to the limited ability to represent complex and heterogeneous feature relationships. To address this issue, this paper presents an Adaptive Multi-view Hypergraph Learning (AMH) framework for cross-condition bearing fault diagnosis. The proposed approach first constructs multiple feature views from time-domain, frequency-domain, and time–frequency representations to capture complementary diagnostic information. Within each view, an adaptive hyperedge generation strategy is introduced to dynamically model high-order correlations by jointly considering feature similarity and operating condition relevance. The resulting hypergraph embeddings are then integrated through an attention-based fusion module that adaptively emphasizes the most informative views for fault classification. Extensive experiments on the Case Western Reserve University and Ottawa bearing datasets demonstrate that AMH consistently outperforms conventional graph-based and deep learning baselines in terms of classification precision, recall, and F1-score under cross-condition settings. The ablation studies further confirm the importance of adaptive hyperedge construction and attention-guided multi-view fusion in improving robustness and generalization. These results highlight the strong potential of the proposed framework for practical intelligent fault diagnosis in complex industrial environments. Full article

15 pages, 3459 KB  
Article
Multi-Granularity Invariant Structure Learning for Text Classification in Entrepreneurship Policy
by Xinyu Sun and Meifang Yao
Mathematics 2025, 13(22), 3648; https://doi.org/10.3390/math13223648 - 14 Nov 2025
Viewed by 483
Abstract
Data-driven text classification technology is crucial for understanding and managing a large number of entrepreneurial policy-related texts, yet it is hindered by two primary challenges. First, the intricate, multi-faceted nature of policy documents often leads to insufficient information extraction, as existing models struggle to synergistically leverage diverse information types, such as statistical regularities, linguistic structures, and external factual knowledge, resulting in semantic sparsity. Second, the performance of state-of-the-art deep learning models is heavily reliant on large-scale annotated data, a resource that is scarce and costly to acquire in entrepreneurial policy domains, rendering models susceptible to overfitting and poor generalization. To address these challenges, this paper proposes a Multi-granularity Invariant Structure Learning (MISL) model. Specifically, MISL first employs a multi-view feature engineering module that constructs and fuses distinct statistical, linguistic, and knowledge graphs to generate a comprehensive and rich semantic representation, thereby alleviating semantic sparsity. Furthermore, to enhance robustness and generalization from limited data, we introduce a dual invariant structure learning framework. This framework operates at two levels: (1) sample-invariant representation learning uses data augmentation and mutual information maximization to learn the essential semantic core of a text, invariant to superficial perturbations; (2) neighborhood-invariant semantic learning applies a contrastive objective on a nearest-neighbor graph to enforce intra-class compactness and inter-class separability in the feature space. Extensive experiments demonstrate that our proposed MISL model significantly outperforms state-of-the-art baselines, proving its effectiveness and robustness for classifying complex texts in entrepreneurial policy domains. Full article
(This article belongs to the Special Issue Artificial Intelligence and Data Science, 2nd Edition)

31 pages, 2985 KB  
Article
Heterogeneous Ensemble Sentiment Classification Model Integrating Multi-View Features and Dynamic Weighting
by Song Yang, Jiayao Xing, Zongran Dong and Zhaoxia Liu
Electronics 2025, 14(21), 4189; https://doi.org/10.3390/electronics14214189 - 27 Oct 2025
Viewed by 750
Abstract
With the continuous growth of user reviews, identifying underlying sentiment across multi-source texts efficiently and accurately has become a significant challenge in NLP. Traditional single models in cross-domain sentiment analysis often exhibit insufficient stability, limited generalization capabilities, and sensitivity to class imbalance. Existing ensemble methods predominantly rely on static weighting or voting strategies among homogeneous models, failing to fully leverage the complementary advantages between models. To address these issues, this study proposes a heterogeneous ensemble sentiment classification model integrating multi-view features and dynamic weighting. At the feature learning layer, the model constructs three complementary base learners, a RoBERTa-FC for extracting global semantic features, a BERT-BiGRU for capturing temporal dependencies, and a TextCNN-Attention for focusing on local semantic features, thereby achieving multi-level text representation. At the decision layer, a meta-learner is used to fuse multi-view features, and dynamic uncertainty weighting and attention weighting strategies are employed to adaptively adjust outputs from different base learners. Experimental results across multiple domains demonstrate that the proposed model consistently outperforms single learners and comparison methods in terms of Accuracy, Precision, Recall, F1 Score, and Macro-AUC. On average, the ensemble model achieves a Macro-AUC of 0.9582 ± 0.023 across five datasets, with an Accuracy of 0.9423, an F1 Score of 0.9590, and a Macro-AUC of 0.9797 on the AlY_ds dataset. Moreover, in cross-dataset ranking evaluation based on equally weighted metrics, the model consistently ranks within the top two, confirming its superior cross-domain adaptability and robustness. These findings highlight the effectiveness of the proposed framework in enhancing sentiment classification performance and provide valuable insights for future research on lightweight dynamic ensembles, multilingual, and multimodal applications. Full article
(This article belongs to the Section Artificial Intelligence)
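
One way to picture the dynamic uncertainty weighting at the decision layer is to weight each base learner by the inverse entropy of its predicted distribution for the current sample, so that confident learners dominate the fused output. The abstract does not give the exact scheme or the attention-weighting variant, so the sketch below is only an assumption.

```python
import numpy as np

def uncertainty_weighted_fusion(prob_list, eps: float = 1e-8) -> np.ndarray:
    """Fuse per-model class probabilities, down-weighting uncertain (high-entropy) models.

    prob_list: one (num_classes,) probability vector per base learner for a single sample.
    """
    entropies = np.array([-(p * np.log(p + eps)).sum() for p in prob_list])
    weights = 1.0 / (entropies + eps)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, prob_list))

# Hypothetical outputs of the RoBERTa-FC, BERT-BiGRU, and TextCNN-Attention learners for one review
probs = [np.array([0.70, 0.20, 0.10]),
         np.array([0.40, 0.35, 0.25]),
         np.array([0.60, 0.30, 0.10])]
print(uncertainty_weighted_fusion(probs))
```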

23 pages, 1659 KB  
Article
A Multi-View-Based Federated Learning Approach for Intrusion Detection
by Jia Yu, Guoqiang Wang, Nianfeng Shi, Raghav Saxena and Brian Lee
Electronics 2025, 14(21), 4166; https://doi.org/10.3390/electronics14214166 - 24 Oct 2025
Viewed by 962
Abstract
Intrusion detection aims to identify the unauthorized activities within computer networks or systems by classifying events into normal or abnormal categories. As modern scenarios often involve multi-source data, multi-view fusion deep learning methods are employed to leverage diverse viewpoints for enhancing security threat detection. This paper introduces a novel intrusion detection approach using multi-view fusion within a federated learning framework, proposing an integrated AE Neural SVM (AE-NSVM) model that combines auto-encoder (AE) multi-view feature extraction and Support Vector Machine (SVM) classification. This approach simultaneously learns representative features from multiple views and classifies network samples into normal or seven attack categories while employing federated learning across clients to ensure adaptability and robustness in diverse network environments. The experimental results obtained from two benchmark datasets validate its superiority: on TON_IoT, the CAE-NSVM model achieves a highest F1-measure of 0.792 (1.4% higher than traditional pipeline systems); on UNSW-NB15, it delivers an F1-score of 0.829 with a 73% reduced training time and an 89% faster inference compared to baseline models. These results demonstrate the advantages of multi-view fusion in federated learning for balancing accuracy and efficiency in distributed intrusion detection systems. Full article
(This article belongs to the Special Issue Advances in Data Security: Challenges, Technologies, and Applications)
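
The federated side of the approach can be illustrated with FedAvg-style parameter averaging across clients; the AE-NSVM model itself is omitted here, and weighting by local sample count is an assumption rather than a detail taken from the paper.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Weighted average of client model parameters (FedAvg-style aggregation).

    client_params: one {name: ndarray} dict per client; client_sizes: local sample counts.
    """
    total = float(sum(client_sizes))
    return {
        name: sum((n / total) * params[name] for params, n in zip(client_params, client_sizes))
        for name in client_params[0]
    }

# Two hypothetical clients sharing an encoder weight matrix and bias
clients = [
    {"enc.w": np.ones((4, 4)), "enc.b": np.zeros(4)},
    {"enc.w": 3 * np.ones((4, 4)), "enc.b": np.ones(4)},
]
print(fedavg(clients, client_sizes=[100, 300])["enc.w"][0, 0])   # 2.5
```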

29 pages, 6329 KB  
Article
Non-Contact Measurement of Sunflower Flowerhead Morphology Using Mobile-Boosted Lightweight Asymmetric (MBLA)-YOLO and Point Cloud Technology
by Qiang Wang, Xinyuan Wei, Kaixuan Li, Boxin Cao and Wuping Zhang
Agriculture 2025, 15(21), 2180; https://doi.org/10.3390/agriculture15212180 - 22 Oct 2025
Viewed by 688
Abstract
The diameter of the sunflower flower head and the thickness of its margins are important crop phenotypic parameters. Traditional, single-dimensional two-dimensional imaging methods often struggle to balance precision with computational efficiency. This paper addresses the limitations of the YOLOv11n-seg model in the instance segmentation of floral disk fine structures by proposing the MBLA-YOLO instance segmentation model, achieving both lightweight efficiency and high accuracy. Building upon this foundation, a non-contact measurement method is proposed that combines an improved model with three-dimensional point cloud analysis to precisely extract key structural parameters of the flower head. First, image annotation is employed to eliminate interference from petals and sepals, whilst instance segmentation models are used to delineate the target region. The segmentation results for the disc surface (front) and edges (sides) are then mapped onto the three-dimensional point cloud space. Target regions are extracted, and following processing, separate models are constructed for the disc surface and edges. Finally, to account for the differences between the surface and edge structures, targeted methods are employed for their respective calculations. Whilst maintaining lightweight characteristics, the proposed MBLA-YOLO model achieves simultaneous improvements in accuracy and efficiency compared to the baseline YOLOv11n-seg. The introduced CKMB backbone module enhances feature modelling capabilities for complex structural details, whilst the LADH detection head improves small object recognition and boundary segmentation accuracy. Specifically, the CKMB module integrates MBConv and channel attention to strengthen multi-scale feature extraction and representation, while the LADH module adopts a tri-branch design for classification, regression, and IoU prediction, structurally improving detection precision and boundary recognition. This research not only demonstrates superior accuracy and robustness but also significantly reduces computational overhead, thereby achieving an excellent balance between model efficiency and measurement precision. This method avoids the need for three-dimensional reconstruction of the entire plant and multi-view point cloud registration, thereby reducing data redundancy and computational resource expenditure. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

22 pages, 1806 KB  
Article
MAMVCL: Multi-Atlas Guided Multi-View Contrast Learning for Autism Spectrum Disorder Classification
by Zuohao Yin, Feng Xu, Yue Ma, Shuo Huang, Kai Ren and Li Zhang
Brain Sci. 2025, 15(10), 1086; https://doi.org/10.3390/brainsci15101086 - 8 Oct 2025
Cited by 1 | Viewed by 632
Abstract
Background: Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by significant neurological plasticity in early childhood, where timely interventions like behavioral therapy, language training, and social skills development can mitigate symptoms. Contributions: We introduce a novel Multi-Atlas Guided Multi-View Contrast Learning (MAMVCL) framework for ASD classification, leveraging functional connectivity (FC) matrices from multiple brain atlases to enhance diagnostic accuracy. Methodology: The MAMVCL framework integrates imaging and phenotypic data through a population graph, where node features derive from imaging data, edge indices are based on similarity scoring matrices, and edge weights reflect phenotypic similarities. Graph convolution extracts global field-of-view features. Concurrently, a Target-aware attention aggregator processes FC matrices to capture high-order brain region dependencies, yielding local field-of-view features. To ensure consistency in subject characteristics, we employ a graph contrastive learning strategy that aligns global and local feature representations. Results: Experimental results on the ABIDE-I dataset demonstrate that our model achieves an accuracy of 85.71%, outperforming most existing methods and confirming its effectiveness. Implications: The proposed model demonstrates superior performance in ASD classification, highlighting the potential of multi-atlas and multi-view learning for improving diagnostic precision and supporting early intervention strategies. Full article
(This article belongs to the Special Issue Advances in Emotion Processing and Cognitive Neuropsychology)

23 pages, 5437 KB  
Article
Hierarchical Deep Learning for Abnormality Classification in Mouse Skeleton Using Multiview X-Ray Images: Convolutional Autoencoders Versus ConvNeXt
by Muhammad M. Jawaid, Rasneer S. Bains, Sara Wells and James M. Brown
J. Imaging 2025, 11(10), 348; https://doi.org/10.3390/jimaging11100348 - 7 Oct 2025
Viewed by 675
Abstract
Single-view-based anomaly detection approaches present challenges due to the lack of context, particularly for multi-label problems. In this work, we demonstrate the efficacy of using multiview image data for improved classification using a hierarchical learning approach. Using 170,958 images from the International Mouse Phenotyping Consortium (IMPC) repository, a specimen-wise multiview dataset comprising 54,046 specimens was curated. Next, two hierarchical classification frameworks were developed by customizing ConvNeXT and a convolutional autoencoder (CAE) as CNN backbones, respectively. The customized architectures were trained at three hierarchy levels with increasing anatomical granularity, enabling specialized layers to learn progressively more detailed features. At the top level (L1), multiview (MV) classification performed about the same as single views, with a high mean AUC of 0.95. However, using MV images in the hierarchical model greatly improved classification at levels 2 and 3. The model showed consistently higher average AUC scores with MV compared to single views such as dorsoventral or lateral. For example, at Level 2 (L2), the model divided abnormal cases into three subclasses, achieving AUCs of 0.65 for DV, 0.76 for LV, and 0.87 for MV. Then, at Level 3 (L3), it further divided these into ten specific abnormalities, with AUCs of 0.54 for DV, 0.59 for LV, and 0.82 for MV. A similar performance was achieved by the CAE-driven architecture, with mean AUCs of 0.87, 0.88, and 0.89 at Level 2 (L2) and 0.74, 0.78, and 0.81 at Level 3 (L3), respectively, for DV, LV, and MV views. The overall results demonstrate the advantage of multiview image data coupled with hierarchical learning for skeletal abnormality detection in a multi-label context. Full article
(This article belongs to the Section Medical Imaging)
