Latent Feature Group Learning for High-Dimensional Data Clustering

1 Big Data Institute, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
2 National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China
3 School of Computer Science, McGill University, Quebec, H3A 0G4, Canada
* Author to whom correspondence should be addressed.
Information 2019, 10(6), 208; https://doi.org/10.3390/info10060208
Received: 1 April 2019 / Revised: 17 May 2019 / Accepted: 6 June 2019 / Published: 10 June 2019
(This article belongs to the Section Artificial Intelligence)

Abstract

In this paper, we propose a latent feature group learning (LFGL) algorithm to discover the
feature grouping structures and subspace clusters of high-dimensional data. The feature grouping
structures, which are learned in an analytical way, can enhance the accuracy and efficiency of
high-dimensional data clustering. In the LFGL algorithm, a Darwinian evolutionary process is used
to explore the optimal feature grouping structures, which are coded as chromosomes in a genetic
algorithm. The feature grouping weighting k-means algorithm serves as the fitness function that
evaluates the chromosomes, i.e., the feature grouping structures, in each generation of evolution.
To better handle the diverse densities of clusters in high-dimensional data, the original feature
grouping weighting k-means is revised in two ways: the Euclidean distance measure is replaced with
a mass-based dissimilarity measure, and the feature weights are optimized as a nonnegative matrix
factorization problem under an orthogonality constraint on the feature weight matrix. The genetic
operations of mutation and crossover are used to generate the new chromosomes for the next
generation. Compared with well-known clustering algorithms, the LFGL algorithm produced
encouraging experimental results on real-world datasets, indicating better performance when
clustering high-dimensional data.
Keywords: subspace clustering; feature grouping; genetic algorithm; high-dimensional data analysis; evolutionary computing
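
The evolutionary search described in the abstract can be illustrated with a minimal sketch. It assumes one possible encoding, in which gene j of a chromosome holds the group index of feature j, and it uses single-point crossover, per-gene mutation, and truncation selection, none of which are confirmed by the abstract. The fitness used here, `pair_agreement` against a made-up `reference` grouping, is only a placeholder: in LFGL the fitness is computed by the revised feature grouping weighting k-means, which is not reproduced here. All function and parameter names are hypothetical.

```python
import random

def random_chromosome(n_features, n_groups, rng):
    # Assumed encoding: gene j is the group index assigned to feature j.
    return [rng.randrange(n_groups) for _ in range(n_features)]

def crossover(a, b, rng):
    # Single-point crossover: swap the tails of two parents.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom, n_groups, rate, rng):
    # Reassign each gene to a random group with probability `rate`.
    return [rng.randrange(n_groups) if rng.random() < rate else g
            for g in chrom]

def evolve(fitness, n_features, n_groups, pop_size=20,
           generations=30, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(n_features, n_groups, rng)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection (an assumption): keep the fitter half.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            c1, c2 = crossover(p1, p2, rng)
            children.append(mutate(c1, n_groups, mutation_rate, rng))
            children.append(mutate(c2, n_groups, mutation_rate, rng))
        population = (parents + children)[:pop_size]
    return max(population, key=fitness)

def pair_agreement(chrom, reference):
    # Placeholder fitness, NOT the paper's: counts feature pairs on which
    # the chromosome and a reference grouping agree about co-membership.
    score = 0
    for i in range(len(chrom)):
        for j in range(i + 1, len(chrom)):
            score += (chrom[i] == chrom[j]) == (reference[i] == reference[j])
    return score

# Toy run: six features, two groups, a made-up reference grouping.
reference = [0, 0, 0, 1, 1, 1]
best = evolve(lambda c: pair_agreement(c, reference),
              n_features=6, n_groups=2)
print(best, pair_agreement(best, reference))
```

Because the fitness is passed in as a callable, a clustering-based score could be substituted for the placeholder without changing the evolutionary loop.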
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Wang, W.; He, Y.; Ma, L.; Huang, J.Z.Z. Latent Feature Group Learning for High-Dimensional Data Clustering. Information 2019, 10, 208.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland.