Search Results (13)

Search Parameters:
Keywords = label-related feature redundancy

23 pages, 13739 KB  
Article
Traffic Accident Rescue Action Recognition Method Based on Real-Time UAV Video
by Bo Yang, Jianan Lu, Tao Liu, Bixing Zhang, Chen Geng, Yan Tian and Siyu Zhang
Drones 2025, 9(8), 519; https://doi.org/10.3390/drones9080519 - 24 Jul 2025
Viewed by 1788
Abstract
Low-altitude drones, which are unimpeded by traffic congestion or urban terrain, have become a critical asset in emergency rescue missions. To address the current lack of emergency rescue data, UAV aerial videos were collected to create an experimental dataset for action classification and localization annotation. A total of 5082 keyframes were labeled with 1–5 targets each, and 14,412 instances of data were prepared (including flight altitude and camera angles) for action classification and position annotation. To mitigate the challenges posed by high-resolution drone footage with excessive redundant information, we propose the SlowFast-Traffic (SF-T) framework, a spatio-temporal sequence-based algorithm for recognizing traffic accident rescue actions. For more efficient extraction of target–background correlation features, we introduce the Actor-Centric Relation Network (ACRN) module, which employs temporal max pooling to enhance the time-dimensional features of static backgrounds, significantly reducing redundancy-induced interference. Additionally, smaller ROI feature map outputs are adopted to boost computational speed. To tackle class imbalance in incident samples, we integrate a Class-Balanced Focal Loss (CB-Focal Loss) function, effectively resolving rare-action recognition in specific rescue scenarios. We replace the original Faster R-CNN with YOLOX-s to improve the target detection rate. On our proposed dataset, the SF-T model achieves a mean average precision (mAP) of 83.9%, which is 8.5% higher than that of the standard SlowFast architecture while maintaining a processing speed of 34.9 tasks/s. Both accuracy-related metrics and computational efficiency are substantially improved. The proposed method demonstrates strong robustness and real-time analysis capabilities for modern traffic rescue action recognition. Full article
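
The abstract above names a Class-Balanced Focal Loss but does not give its formula. The following is a minimal sketch of one common formulation, assuming effective-number class weights of the form (1 - beta) / (1 - beta^n) combined with a softmax focal term; the function names and the `beta` and `gamma` defaults are illustrative assumptions, not values taken from the article.

```python
import math

def class_balanced_weights(samples_per_class, beta=0.999):
    """Effective-number weights (1 - beta) / (1 - beta**n), rescaled to sum to the class count.

    Assumes every class has at least one sample (n >= 1) and 0 <= beta < 1.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cb_focal_loss(logits, target, weights, gamma=2.0):
    """Class-weighted focal loss for one sample: -w_t * (1 - p_t)**gamma * log(p_t)."""
    p_t = softmax(logits)[target]
    return -weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical class counts: a frequent action and a rare one.
weights = class_balanced_weights([1000, 10])
loss_rare = cb_focal_loss([0.2, 0.1], target=1, weights=weights)
```

With `gamma=0` and uniform weights the expression reduces to ordinary cross-entropy, which is a convenient sanity check; rare classes receive larger weights, so their errors contribute more to the total loss.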

33 pages, 7056 KB  
Article
Semi-Supervised Attribute Selection Algorithms for Partially Labeled Multiset-Valued Data
by Yuanzi He, Jiali He, Haotian Liu and Zhaowen Li
Mathematics 2025, 13(8), 1318; https://doi.org/10.3390/math13081318 - 17 Apr 2025
Cited by 1 | Viewed by 682
Abstract
In machine learning, semi-supervised learning algorithms are used when only a portion of the data is labeled. A dataset with missing attribute values or labels is referred to as an incomplete information system. Addressing incomplete information within a system poses a significant challenge, which can be effectively tackled through the application of rough set theory (R-theory). However, R-theory has its limits: it fails to consider the frequency of an attribute value and therefore cannot appropriately describe the distribution of attribute values. If we consider partially labeled data and replace a missing attribute value with the multiset of all possible attribute values under the same attribute, this results in the emergence of partially labeled multiset-valued data. In a semi-supervised learning algorithm, in order to save time and costs, a large number of redundant features need to be deleted. This study proposes semi-supervised attribute selection algorithms for partially labeled multiset-valued data. Initially, a partially labeled multiset-valued decision information system (p-MSVDIS) is partitioned into two distinct systems: a labeled multiset-valued decision information system (l-MSVDIS) and an unlabeled multiset-valued decision information system (u-MSVDIS). Subsequently, using the indistinguishable relation, distinguishable relation, and dependence function, two types of attribute subset importance in a p-MSVDIS are defined: the weighted sum of l-MSVDIS and u-MSVDIS determined by the missing rate of labels, which can be considered an uncertainty measurement (UM) of a p-MSVDIS. Next, two adaptive semi-supervised attribute selection algorithms for a p-MSVDIS are introduced, which leverage the degrees of importance, allowing for automatic adaptation to diverse missing rates. Finally, experiments and statistical analyses are conducted on 11 datasets. The results indicate that the proposed algorithms demonstrate advantages over certain algorithms. Full article

22 pages, 1599 KB  
Article
Single-Stage Entity–Relation Joint Extraction of Pesticide Registration Information Based on HT-BES Multi-Dimensional Labeling Strategy
by Chenyang Dong, Shiyu Xi, Yinchao Che, Shufeng Xiong, Xinming Ma, Lei Xi and Shuping Xiong
Algorithms 2024, 17(12), 559; https://doi.org/10.3390/a17120559 - 6 Dec 2024
Viewed by 1001
Abstract
Pesticide registration information is an essential part of the pesticide knowledge base. However, the large amount of unstructured text data that it contains poses significant challenges for knowledge storage, retrieval, and utilization. To address the characteristics of pesticide registration text such as high information density, complex logical structures, large spans between entities, and heterogeneous entity lengths, as well as to overcome the challenges faced when using traditional joint extraction methods, including triplet overlap, exposure bias, and redundant computation, we propose a single-stage entity–relation joint extraction model based on HT-BES multi-dimensional labeling (MD-SERel). First, in the encoding layer, to address the complex structural characteristics of pesticide registration texts, we employ RoBERTa combined with a multi-head self-attention mechanism to capture the deep semantic features of the text. Simultaneously, syntactic features are extracted using a syntactic dependency tree and graph neural networks to enhance the model’s understanding of text structure. Subsequently, we integrate semantic and syntactic features, enriching the character vector representations and thus improving the model’s ability to represent complex textual data. Second, in the multi-dimensional labeling framework layer, we use HT-BES multi-dimensional labeling, where the model assigns multiple labels to each character. These labels include entity boundaries, positions, and head–tail entity association information, which naturally resolves overlapping triplets. By using a parallel scoring function and fine-grained classification components, the joint extraction of entities and relations is transformed into a multi-label sequence labeling task based on relation dimensions. This process does not involve interdependent steps, thus enabling single-stage parallel labeling, preventing exposure bias and reducing computational redundancy.
Finally, in the decoding layer, entity–relation triplets are decoded based on the predicted labels from the fine-grained classification. The experimental results demonstrate that the MD-SERel model performs well on both the Pesticide Registration Dataset (PRD) and the general DuIE dataset. On the PRD, compared to the optimal baseline model, the training time is 1.2 times faster, the inference time is 1.2 times faster, and the F1 score is improved by 1.5%, demonstrating its knowledge extraction capabilities in pesticide registration documents. On the DuIE dataset, the MD-SERel model also achieved better results compared to the baseline, demonstrating its strong generalization ability. These findings will provide technical support for the construction of pesticide knowledge bases. Full article
(This article belongs to the Special Issue Algorithms for Feature Selection (3rd Edition))

20 pages, 1971 KB  
Article
A Patch-Level Region-Aware Module with a Multi-Label Framework for Remote Sensing Image Captioning
by Yunpeng Li, Xiangrong Zhang, Tianyang Zhang, Guanchun Wang, Xinlin Wang and Shuo Li
Remote Sens. 2024, 16(21), 3987; https://doi.org/10.3390/rs16213987 - 27 Oct 2024
Cited by 3 | Viewed by 2071
Abstract
Recent Transformer-based works can generate high-quality captions for remote sensing images (RSIs). However, these methods generally feed global or grid visual features to a Transformer-based captioning model for associating cross-modal information, which limits performance. In this work, we investigate unexplored ideas for the remote sensing image captioning task, using a novel patch-level region-aware module with a multi-label framework. Due to the overhead perspective and significantly larger scale of RSIs, a patch-level region-aware module is designed to filter the redundant information in the RSI scene, which benefits the Transformer-based decoder by attaining improved image perception. Technically, the trainable multi-label classifier capitalizes on semantic features as a supplement to the region-aware features. Moreover, modeling the inner relations of inputs is essential for understanding the RSI. Thus, we introduce region-oriented attention, which associates region features and semantic labels, omits the irrelevant regions to highlight relevant regions, and learns related semantic information. Extensive qualitative and quantitative experimental results show the superiority of our approach on the RSICD, UCM-Captions, and Sydney-Captions datasets. The code for our method will be publicly available. Full article

23 pages, 2148 KB  
Article
MRG-T: Mask-Relation-Guided Transformer for Remote Vision-Based Pedestrian Attribute Recognition in Aerial Imagery
by Shun Zhang, Yupeng Li, Xiao Wu, Zunheng Chu and Lingfei Li
Remote Sens. 2024, 16(7), 1216; https://doi.org/10.3390/rs16071216 - 29 Mar 2024
Cited by 2 | Viewed by 2185
Abstract
Nowadays, with the rapid development of consumer Unmanned Aerial Vehicles (UAVs), utilizing UAV platforms for visual surveillance has become very attractive, and a key part of this is remote vision-based pedestrian attribute recognition. Pedestrian Attribute Recognition (PAR) is dedicated to predicting multiple attribute labels of a single pedestrian image extracted from surveillance videos and aerial imagery, which presents significant challenges in the computer vision community due to factors such as poor imaging quality and substantial pose variations. Despite recent studies demonstrating impressive advancements in utilizing complicated architectures and exploring relations, most of them may fail to fully and systematically consider the inter-region, inter-attribute, and region-attribute mapping relations simultaneously and may be stuck in the dilemma of information redundancy, leading to the degradation of recognition accuracy. To address these issues, we construct a novel Mask-Relation-Guided Transformer (MRG-T) framework that consists of three relation modeling modules to fully exploit spatial and semantic relations in the model learning process. Specifically, we first propose a Masked Region Relation Module (MRRM) to focus on precise spatial attention regions to extract more robust features with masked random patch training. To explore the semantic association of attributes, we further present a Masked Attribute Relation Module (MARM) to extract intrinsic and semantic inter-attribute relations with an attribute label masking strategy. Based on the cross-attention mechanism, we finally design a Region and Attribute Mapping Module (RAMM) to learn the cross-modal alignment between spatial regions and semantic attributes. We conduct comprehensive experiments on three public benchmarks (PETA, PA-100K, and RAPv1) and conduct inference on a large-scale airborne person dataset named PRAI-1581.
The extensive experimental results demonstrate the superior performance of our method compared to state-of-the-art approaches and validate the effectiveness of mask-relation-guided modeling in the remote vision-based PAR task. Full article
(This article belongs to the Special Issue Signal Processing Theory and Methods in Remote Sensing)

20 pages, 1050 KB  
Article
Joint Entity and Relation Extraction Model Based on Inner and Outer Tensor Dot Product and Single-Table Filling
by Ping Feng, Lin Yang, Boning Zhang, Renjie Wang and Dantong Ouyang
Appl. Sci. 2024, 14(4), 1334; https://doi.org/10.3390/app14041334 - 6 Feb 2024
Cited by 2 | Viewed by 2282
Abstract
Joint relational triple extraction is a crucial step in constructing a knowledge graph from unstructured text. Recently, multiple methods have been proposed for extracting relationship triplets. Notably, end-to-end table-filling methods have garnered significant research interest due to their efficient extraction capabilities. However, existing approaches usually generate separate tables for each relationship, which neglects the global correlation between relationships and context, producing a large number of useless blank tables. This problem results in issues of redundant information and sample imbalance. To address these challenges, we propose a novel framework for joint entity and relation extraction based on a single-table filling method. This method incorporates all relationships as prompts within the text sequence and associates entity span information with relationship labels. This approach reduces the generation of redundant information and enhances the extraction capability for overlapping triplets. We utilize the internal and external multi-head tensor fusion approach to generate two sets of table feature vectors. These vectors are subsequently merged to capture a wider range of global information. Experimental results on the NYT and WebNLG datasets demonstrate the effectiveness of our proposed model, which maintains excellent performance, even in complex scenarios involving overlapping triplets. Full article

23 pages, 7472 KB  
Article
Real-Time Wildfire Detection Algorithm Based on VIIRS Fire Product and Himawari-8 Data
by Da Zhang, Chunlin Huang, Juan Gu, Jinliang Hou, Ying Zhang, Weixiao Han, Peng Dou and Yaya Feng
Remote Sens. 2023, 15(6), 1541; https://doi.org/10.3390/rs15061541 - 11 Mar 2023
Cited by 22 | Viewed by 9711
Abstract
Wildfires have a significant impact on the atmosphere, terrestrial ecosystems, and society. Real-time monitoring of wildfire locations is crucial in fighting wildfires and reducing human casualties and property damage. Geostationary satellites offer the advantage of high temporal resolution and are gradually being used for real-time fire detection. In this study, we constructed a fire label dataset using the stable VNP14IMG fire product and used the random forest (RF) model for fire detection based on Himawari-8 multiband data. Band calculation features related to brightness temperature, spatial features, and auxiliary data are used as inputs for model training in this framework. We also used a recursive feature elimination method to evaluate the impact of these features on model accuracy and to exclude redundant features. The daytime and nighttime RF models (RF-D/RF-N) were constructed separately to analyze their applicability. Finally, we extensively evaluated the models' performance by comparing them with the Japan Aerospace Exploration Agency (JAXA) wildfire product. The RF models exhibited higher accuracy, with recall and precision rates of 95.62% and 59%, respectively, and the recall rate for small fires was 19.44% higher than that of the JAXA wildfire product. Adding band calculation features and spatial features, as well as feature selection, effectively reduced overfitting and improved the model’s generalization ability. The RF-D model had higher fire detection accuracy than the RF-N model. Omission errors and commission errors were mainly concentrated in the adjacent pixels of the fire clusters. In conclusion, our VIIRS fire product and Himawari-8 data-based fire detection model can monitor the fire location in real time and has excellent detection capability for small fires, making it highly significant for fire detection. Full article
(This article belongs to the Topic Application of Remote Sensing in Forest Fire)

21 pages, 8362 KB  
Article
AI-TFNet: Active Inference Transfer Convolutional Fusion Network for Hyperspectral Image Classification
by Jianing Wang, Linhao Li, Yichen Liu, Jinyu Hu, Xiao Xiao and Bo Liu
Remote Sens. 2023, 15(5), 1292; https://doi.org/10.3390/rs15051292 - 26 Feb 2023
Cited by 6 | Viewed by 2755
Abstract
The realization of efficient classification with limited labeled samples is a critical task in hyperspectral image classification (HSIC). Convolutional neural networks (CNNs) have achieved remarkable advances while considering spectral–spatial features simultaneously, while conventional patch-wise-based CNNs usually lead to redundant computations. Therefore, in this paper, we established a novel active inference transfer convolutional fusion network (AI-TFNet) for HSI classification. First, in order to reveal and merge the local low-level and global high-level spectral–spatial contextual features at different stages of extraction, an end-to-end fully hybrid multi-stage transfer fusion network (TFNet) was designed to improve classification performance and efficiency. Meanwhile, an active inference (AI) pseudo-label propagation algorithm for spatially homogeneous samples was constructed using the homogeneous pre-segmentation of the proposed TFNet. In addition, a confidence-augmented pseudo-label loss (CapLoss) was proposed in order to define the confidence of a pseudo-label with an adaptive threshold in homogeneous regions for acquiring pseudo-label samples; this can adaptively infer a pseudo-label by actively augmenting the homogeneous training samples based on their spatial homogeneity and spectral continuity. Experiments on three real HSI datasets proved that the proposed method had competitive performance and efficiency compared to several related state-of-the-art methods. Full article
(This article belongs to the Special Issue Active Learning Methods for Remote Sensing Data Processing)

14 pages, 1404 KB  
Article
Specific Relation Attention-Guided Graph Neural Networks for Joint Entity and Relation Extraction in Chinese EMR
by Yali Pang, Xiaohui Qin and Zhichang Zhang
Appl. Sci. 2022, 12(17), 8493; https://doi.org/10.3390/app12178493 - 25 Aug 2022
Cited by 6 | Viewed by 2586
Abstract
Electronic medical records (EMRs) contain a variety of valuable medical entities and their relations. The extraction of medical entities and their relations has important application value in the structuring of EMRs and the development of various types of intelligent assistant medical systems, and hence is a hot issue in intelligent medicine research. In recent years, most research has aimed to first identify entities and then recognize the relations between them, an approach that often suffers from many redundant operations. Furthermore, the challenge remains of identifying overlapping relation triplets along with the entire medical entity boundary and detecting multi-type relations. In this work, we propose a Specific Relation Attention-guided Graph Neural Networks (SRAGNNs) model to jointly extract entities and their relations in Chinese EMR, which uses sentence information and attention-guided graph neural networks to perceive the features of every relation in a sentence and then to extract those relations. In addition, a specific sentence representation is constructed for every relation, and sequence labeling is performed to extract its corresponding head and tail entities. Experiments on a medical evaluation dataset and a manually labeled Chinese EMR dataset show that our model improves the performance of Chinese medical entity and relation extraction. Full article
(This article belongs to the Special Issue Advanced Machine Learning in Medical Informatics)

23 pages, 717 KB  
Article
A Multi-View Framework to Detect Redundant Activity Labels for More Representative Event Logs in Process Mining
by Qifan Chen, Yang Lu, Charmaine S. Tam and Simon K. Poon
Future Internet 2022, 14(6), 181; https://doi.org/10.3390/fi14060181 - 9 Jun 2022
Cited by 5 | Viewed by 2899
Abstract
Process mining aims to gain knowledge of business processes via the discovery of process models from event logs generated by information systems. The insights revealed from process mining heavily rely on the quality of the event logs. Activities extracted from different data sources or the free-text nature within the same system may lead to inconsistent labels. Such inconsistency then leads to redundancy in activity labels, that is, labels that have different syntax but share the same behaviours. Redundant activity labels can introduce unnecessary complexities to the event logs. The identification of these labels from data-driven process discovery is difficult and relies heavily on human intervention. Neither existing process discovery algorithms nor event data preprocessing techniques can solve such redundancy efficiently. In this paper, we propose a multi-view approach to automatically detect redundant activity labels by using not only context-aware features such as control–flow relations and attribute values but also semantic features from the event logs. Our evaluation on several publicly available datasets and a real-life case study demonstrates that our approach can efficiently detect redundant activity labels even with low-occurrence frequencies. The proposed approach can add value to the preprocessing step to generate more representative event logs. Full article
(This article belongs to the Special Issue Trends of Data Science and Knowledge Discovery)
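
The abstract above mentions control-flow relations as one view for spotting redundant activity labels. As a rough illustration only, and not the authors' method, the sketch below assumes that two labels are candidates for merging when they share similar direct predecessors and successors in the traces, measured with Jaccard similarity; the function names and the toy log are hypothetical.

```python
from collections import defaultdict

def control_flow_context(traces):
    """Map each activity label to (direct predecessors, direct successors)."""
    preds, succs = defaultdict(set), defaultdict(set)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            succs[a].add(b)
            preds[b].add(a)
    labels = {activity for trace in traces for activity in trace}
    return {label: (preds[label], succs[label]) for label in labels}

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def context_similarity(ctx, x, y):
    """Average of predecessor-set and successor-set similarity for labels x and y."""
    (pred_x, succ_x), (pred_y, succ_y) = ctx[x], ctx[y]
    return (jaccard(pred_x, pred_y) + jaccard(succ_x, succ_y)) / 2

# Hypothetical event log: "check invoice" and "verify invoice" play the same role.
traces = [
    ["register", "check invoice", "pay"],
    ["register", "verify invoice", "pay"],
    ["register", "check invoice", "archive"],
]
ctx = control_flow_context(traces)
score = context_similarity(ctx, "check invoice", "verify invoice")
```

A high score only flags a candidate pair. The abstract describes combining this kind of context with attribute values and semantic features, which a single similarity score does not capture.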

21 pages, 2533 KB  
Article
Multi-Label Feature Selection Combining Three Types of Conditional Relevance
by Lingbo Gao, Yiqiang Wang, Yonghao Li, Ping Zhang and Liang Hu
Entropy 2021, 23(12), 1617; https://doi.org/10.3390/e23121617 - 1 Dec 2021
Cited by 1 | Viewed by 2874
Abstract
With the rapid growth of the Internet, the curse of dimensionality caused by massive multi-label data has attracted extensive attention. Feature selection plays an indispensable role in dimensionality reduction processing. Many researchers have focused on this subject based on information theory. Here, to evaluate feature relevance, a novel feature relevance term (FR) that employs three incremental information terms to comprehensively consider three key aspects (candidate features, selected features, and label correlations) is designed. A thorough examination of the three key aspects of FR outlined above is more favorable to capturing the optimal features. Moreover, we employ label-related feature redundancy as the label-related feature redundancy term (LR) to reduce unnecessary redundancy. Therefore, a designed multi-label feature selection method that integrates FR with LR is proposed, namely, Feature Selection combining three types of Conditional Relevance (TCRFS). Numerous experiments indicate that TCRFS outperforms the other 6 state-of-the-art multi-label approaches on 13 multi-label benchmark data sets from 4 domains. Full article
(This article belongs to the Topic Machine and Deep Learning)

16 pages, 1577 KB  
Article
Discriminable Multi-Label Attribute Selection for Pre-Course Student Performance Prediction
by Jie Yang, Shimin Hu, Qichao Wang and Simon Fong
Entropy 2021, 23(10), 1252; https://doi.org/10.3390/e23101252 - 26 Sep 2021
Cited by 4 | Viewed by 2829
Abstract
The university curriculum is a systematic and organic study complex with some immediate associated steps; the initial learning of each semester’s course is crucial, and significantly impacts the learning process of subsequent courses and further studies. However, the low teacher–student ratio makes it difficult for teachers to consistently follow up on the detail-oriented learning situation of individual students. The extant learning early warning system is committed to automatically detecting whether students have potential difficulties—or even the risk of failing, or non-pass reports—before starting the course. Previous related research has the following three problems: first of all, it mainly focused on e-learning platforms and relied on online activity data, which was not suitable for traditional teaching scenarios; secondly, most current methods can only proffer predictions when the course is in progress, or even approaching the end; thirdly, few studies have focused on the feature redundancy in these learning data. Aiming at the traditional classroom teaching scenario, this paper transforms the pre-class student performance prediction problem into a multi-label learning model, and uses the attribute reduction method to scientifically streamline the characteristic information of the courses taken and explore the important relationship between the characteristics of the previously learned courses and the attributes of the courses to be taken, in order to detect high-risk students in each course before the course begins. Extensive experiments were conducted on 10 real-world datasets, and the results proved that the proposed approach achieves better performance than most other advanced methods in multi-label classification evaluation metrics. Full article
(This article belongs to the Special Issue Information-Theoretic Data Mining)

24 pages, 1524 KB  
Article
Multi-Label Feature Selection Based on High-Order Label Correlation Assumption
by Ping Zhang, Wanfu Gao, Juncheng Hu and Yonghao Li
Entropy 2020, 22(7), 797; https://doi.org/10.3390/e22070797 - 21 Jul 2020
Cited by 26 | Viewed by 4877
Abstract
Multi-label data often involve features with high dimensionality and complicated label correlations, resulting in a great challenge for multi-label learning. Feature selection plays an important role in multi-label learning to address multi-label data. Exploring label correlations is crucial for multi-label feature selection. Previous information-theoretical-based methods employ the strategy of cumulative summation approximation to evaluate candidate features, which merely considers low-order label correlations. In fact, high-order label correlations exist in the label set: labels naturally cluster into several groups, with similar labels tending to fall into the same group and dissimilar labels into different groups. However, the strategy of cumulative summation approximation tends to select the features related to the groups containing more labels while ignoring the classification information of groups containing fewer labels. Therefore, many features related to similar labels are selected, which leads to poor classification performance. To this end, a Max-Correlation term considering high-order label correlations is proposed. Additionally, we combine the Max-Correlation term with a feature redundancy term to ensure that selected features are relevant to different label groups. Finally, a new method named Multi-label Feature Selection considering Max-Correlation (MCMFS) is proposed. Experimental results demonstrate the classification superiority of MCMFS in comparison to eight state-of-the-art multi-label feature selection methods. Full article
(This article belongs to the Section Information Theory, Probability and Statistics)
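
The contrast drawn in the abstract above, between cumulative summation over labels and a maximum over labels, can be illustrated with a small sketch. It assumes discrete features and labels and uses plain mutual information; `relevance_sum` and `relevance_max` are hypothetical helper names, and the max-based score is a simplified stand-in rather than the article's exact Max-Correlation term.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def relevance_sum(feature, labels):
    """Cumulative-summation relevance: adds up I(feature; label) over all labels."""
    return sum(mutual_information(feature, label) for label in labels)

def relevance_max(feature, labels):
    """Max-based relevance: keeps only the strongest single-label dependency."""
    return max(mutual_information(feature, label) for label in labels)

# Toy data: three identical labels form a large group; l4 is a group of its own.
l1 = [0, 0, 0, 0, 1, 1, 1, 1]
l4 = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [l1, l1, l1, l4]
f_a = [0, 0, 0, 0, 1, 1, 1, 0]  # noisy copy of the large group's label
f_b = list(l4)                  # exact copy of the small group's label
```

On this toy data the summed score ranks `f_a` above `f_b` because the large label group is counted three times, while the max-based score ranks `f_b` first, which mirrors the bias toward larger label groups that the abstract attributes to cumulative summation.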
