Recent Developments in Clustering and Classification Methods

A special issue of Stats (ISSN 2571-905X).

Deadline for manuscript submissions: closed (31 May 2022) | Viewed by 12831

Special Issue Editors


Guest Editor
Department of Mathematics, Università di Genova, 16100 Genova, Italy
Interests: clustering and classification; mixture models; multivariate dependence models with copula

Guest Editor
Serra Húnter Fellow, Department of Statistics and Operations Research, Universitat Politècnica de Catalunya-BarcelonaTech, 08034 Barcelona, Spain
Interests: biostatistics; categorical data analysis; clustering and classification; computational statistics; mixture models; goodness-of-fit tests; statistical modeling

Special Issue Information

Dear colleagues,

It is our pleasure to announce a Special Issue on “Recent Developments in Clustering and Classification Methods”. This Special Issue aims to showcase state-of-the-art research on statistical models and learning techniques across the different areas of unsupervised and semi-supervised classification.

Topics of particular interest may include but are not limited to:

  • Development of methodological innovations in all fields of classification and clustering, such as model-based clustering, mixture models for continuous, discrete, and mixed data in one or two dimensions (co-clustering, biclustering), robust approaches to data classification, and time-series clustering, among others;
  • New trends in visualization tools;
  • Development of new approaches for model selection and goodness-of-fit tests in clustering and classification methods;
  • New solutions for dealing with missing data in clustering and classification methods.

We look forward to receiving your submissions.

Sincerely,

Dr. Marta Nai Ruscone
Dr. Daniel Fernández
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Stats is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification and clustering
  • graphical tools
  • goodness-of-fit test
  • mixture models
  • model selection
  • missing data
  • unsupervised and semi-supervised classification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)


Research

11 pages, 560 KiB  
Article
Spectral Clustering of Mixed-Type Data
by Felix Mbuga and Cristina Tortora
Stats 2022, 5(1), 1-11; https://doi.org/10.3390/stats5010001 - 23 Dec 2021
Cited by 10 | Viewed by 4765
Abstract
Cluster analysis seeks to assign objects with similar characteristics into groups called clusters so that objects within a group are similar to each other and dissimilar to objects in other groups. Spectral clustering has been shown to perform well in different scenarios on continuous data: it can detect convex and non-convex clusters, and can detect overlapping clusters. However, the constraint on continuous data can be limiting in real applications where data are often of mixed type, i.e., data containing both continuous and categorical features. This paper looks at extending spectral clustering to mixed-type data. The new method replaces the Euclidean-based similarity distance used in conventional spectral clustering with different dissimilarity measures for continuous and categorical variables. A global dissimilarity measure is then computed as a weighted sum, and a Gaussian kernel is used to convert the dissimilarity matrix into a similarity matrix. The new method includes an automatic tuning of the variable weight and kernel parameter. The performance of spectral clustering in different scenarios is compared with that of two state-of-the-art mixed-type data clustering methods, k-prototypes and KAMILA, using several simulated and real data sets. Full article
(This article belongs to the Special Issue Recent Developments in Clustering and Classification Methods)
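The pipeline the abstract describes — separate dissimilarities for continuous and categorical variables, a weighted global dissimilarity, and a Gaussian kernel to turn it into a similarity matrix — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weight w and the kernel parameter sigma are fixed by hand here, whereas the paper tunes them automatically, and a plain k-means with a farthest-point start stands in for the full spectral step. All function names are ours.

```python
import numpy as np

def mixed_dissimilarity(X_cont, X_cat, w):
    """Weighted sum of a continuous (Euclidean) and a categorical
    (simple-matching) dissimilarity. w weights the continuous part."""
    n = X_cont.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d_cont = np.linalg.norm(X_cont[i] - X_cont[j])
            d_cat = np.mean(X_cat[i] != X_cat[j])
            D[i, j] = w * d_cont + (1 - w) * d_cat
    return D

def kmeans(U, k, iters=100):
    """Lloyd's algorithm with a greedy farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((U[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = U[labels == c].mean(axis=0)
    return labels

def spectral_clusters(D, k, sigma=1.0):
    """Gaussian kernel converts dissimilarities into similarities, then
    the points are clustered in the space spanned by the k eigenvectors
    of the graph Laplacian with the smallest eigenvalues."""
    S = np.exp(-D ** 2 / (2 * sigma ** 2))
    L = np.diag(S.sum(axis=1)) - S  # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return kmeans(vecs[:, :k], k)
```

On a toy mixed-type data set with two well-separated groups, this sketch recovers the group structure; the real method's automatic tuning of w and sigma is exactly what it omits.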

15 pages, 913 KiB  
Article
Incorporating Clustering Techniques into GAMLSS
by Thiago G. Ramires, Luiz R. Nakamura, Ana J. Righetto, Andréa C. Konrath and Carlos A. B. Pereira
Stats 2021, 4(4), 916-930; https://doi.org/10.3390/stats4040053 - 12 Nov 2021
Cited by 3 | Viewed by 2782
Abstract
A method for statistical analysis of multimodal and/or highly distorted data is presented. The new methodology combines different clustering methods with the GAMLSS (generalized additive models for location, scale, and shape) framework, and is therefore called c-GAMLSS, for “clustering GAMLSS.” In this new extended structure, a latent variable (cluster) is created to explain the response variable (target). All parameters of the distribution for the response variable can also be modeled by functions of the new covariate, added to other available resources (features). The resources to be used are selected in stages, via a stepwise method. A simulation study considering multiple scenarios is presented to compare the c-GAMLSS method with existing Gaussian mixture models. We show by means of four different data applications that in cases where other authentic explanatory variables are or are not available, the c-GAMLSS structure outperforms mixture models, some recently developed complex distributions, cluster-weighted models, and a mixture-of-experts model. Even though we use simple distributions in our examples, other more sophisticated distributions can be used to explain the response variable. Full article
(This article belongs to the Special Issue Recent Developments in Clustering and Classification Methods)
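The c-GAMLSS recipe — build a latent cluster covariate, then let it drive the parameters of the response distribution — can be caricatured in a few lines. This is a toy sketch under strong simplifications: plain 1-D k-means stands in for the paper's combination of clustering methods, and a per-cluster Gaussian (location and scale only) stands in for a full GAMLSS fit. The function names are ours, not the paper's.

```python
import numpy as np

def cluster_covariate(y, k, iters=50):
    """1-D k-means on the response, producing the latent cluster labels
    that c-GAMLSS adds as a covariate."""
    centers = np.quantile(y, np.linspace(0.1, 0.9, k))  # spread-out start
    for _ in range(iters):
        labels = np.argmin(np.abs(y[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = y[labels == c].mean()
    return labels

def fit_c_gamlss_toy(y, labels):
    """Location and scale modeled as functions of the latent covariate:
    here simply a separate (mean, sd) pair per cluster."""
    return {c: (y[labels == c].mean(), y[labels == c].std())
            for c in np.unique(labels)}
```

With a bimodal response, the per-cluster fit recovers both modes' locations and scales, which a single Gaussian fit would blur together; that is the effect the latent covariate is designed to capture.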

23 pages, 4313 KiB  
Article
Refined Mode-Clustering via the Gradient of Slope
by Kunhui Zhang and Yen-Chi Chen
Stats 2021, 4(2), 486-508; https://doi.org/10.3390/stats4020030 - 1 Jun 2021
Viewed by 2727
Abstract
In this paper, we propose a new clustering method inspired by mode-clustering that not only finds clusters, but also assigns each cluster an attribute label. Clusters obtained from our method show connectivity of the underlying distribution. We also design a local two-sample test based on the clustering result that has more power than a conventional method. We apply our method to the Astronomy and GvHD data and show that our method finds meaningful clusters. We also derive the statistical and computational theory of our method. Full article
(This article belongs to the Special Issue Recent Developments in Clustering and Classification Methods)
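As background for readers new to mode-clustering, the classical mean-shift scheme the paper refines can be sketched as follows: every point ascends the kernel density estimate, and points that converge to the same mode form one cluster. This is only the textbook baseline with a hand-picked bandwidth, not the paper's refinement via the gradient of the slope.

```python
import numpy as np

def mean_shift_modes(X, bandwidth=1.0, steps=100):
    """Basic mean-shift mode-clustering with a Gaussian kernel.
    Returns per-point cluster labels and the estimated mode locations."""
    Z = X.copy()  # one "walker" per data point
    for _ in range(steps):
        # Gaussian-kernel weight of every data point for every walker
        w = np.exp(-((Z[:, None] - X[None]) ** 2).sum(-1)
                   / (2 * bandwidth ** 2))
        # each walker moves to the weighted mean of the data
        Z = (w[:, :, None] * X[None]).sum(1) / w.sum(1, keepdims=True)
    # group walkers whose limits coincide up to a tolerance
    modes, labels = [], []
    for z in Z:
        for i, m in enumerate(modes):
            if np.linalg.norm(z - m) < bandwidth / 10:
                labels.append(i)
                break
        else:
            modes.append(z)
            labels.append(len(modes) - 1)
    return np.array(labels), np.array(modes)
```

On two well-separated blobs this recovers two modes and a clean partition; the paper's contribution is precisely to refine how such basins of attraction are delineated and labeled.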

26 pages, 5960 KiB  
Article
Unsupervised Feature Selection for Histogram-Valued Symbolic Data Using Hierarchical Conceptual Clustering
by Manabu Ichino, Kadri Umbleja and Hiroyuki Yaguchi
Stats 2021, 4(2), 359-384; https://doi.org/10.3390/stats4020024 - 18 May 2021
Cited by 3 | Viewed by 1948
Abstract
This paper presents an unsupervised feature selection method for multi-dimensional histogram-valued data. We define a multi-role measure, called the compactness, based on the concept size of given objects and/or clusters described using a fixed number of equal-probability bin-rectangles. In each step of clustering, we agglomerate objects and/or clusters so as to minimize the compactness of the generated cluster. This means that the compactness plays the role of a similarity measure between objects and/or clusters to be merged. Minimizing the compactness is equivalent to maximizing the dissimilarity of the generated cluster, i.e., concept, against the whole concept in each step. In this sense, the compactness plays the role of cluster quality. We also show that the average compactness of each feature with respect to objects and/or clusters in several clustering steps is useful as a feature effectiveness criterion. Features having small average compactness are mutually covariate and are able to detect a geometrically thin structure embedded in the given multi-dimensional histogram-valued data. We obtain a thorough understanding of the given data via visualization using dendrograms and scatter diagrams with respect to the selected informative features. We illustrate the effectiveness of the proposed method by using an artificial data set and real histogram-valued data sets. Full article
(This article belongs to the Special Issue Recent Developments in Clustering and Classification Methods)
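A drastically simplified sketch of compactness-driven agglomeration, assuming interval-valued rather than histogram-valued objects: the "concept size" below is just the volume of a cluster's bounding hyper-rectangle, standing in for the paper's compactness on equal-probability bin-rectangles, and the merge rule greedily minimizes it at every step. Names and data structures are ours.

```python
import numpy as np

def concept_size(lo, hi):
    """Volume of the bounding hyper-rectangle of a cluster (a crude
    stand-in for the paper's bin-rectangle-based compactness)."""
    return float(np.prod(hi - lo))

def agglomerate(los, his, target=1):
    """Hierarchical conceptual clustering: repeatedly merge the pair of
    clusters whose merged concept has the smallest size.
    los/his: (n, d) arrays of per-object lower and upper interval bounds.
    Returns a list of (member_indices, lo, hi) tuples."""
    clusters = [([i], lo, hi) for i, (lo, hi) in enumerate(zip(los, his))]
    while len(clusters) > target:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                lo = np.minimum(clusters[a][1], clusters[b][1])
                hi = np.maximum(clusters[a][2], clusters[b][2])
                s = concept_size(lo, hi)
                if best is None or s < best[0]:
                    best = (s, a, b, lo, hi)
        s, a, b, lo, hi = best
        merged = (clusters[a][0] + clusters[b][0], lo, hi)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return clusters
```

Because the merged bounding box of two nearby intervals is far smaller than that of two distant ones, the greedy rule merges near neighbors first, which is the behavior the compactness measure formalizes for histogram-valued concepts.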
