Article

Enhancing Customer Segmentation Through Factor Analysis of Mixed Data (FAMD)-Based Approach Using K-Means and Hierarchical Clustering Algorithms

by Chukwutem Pinic Ufeli 1, Mian Usman Sattar 1, Raza Hasan 2,* and Salman Mahmood 3
1 College of Science and Engineering, University of Derby, Kedleston Road, Derby DE22 1GB, UK
2 Department of Science and Engineering, Solent University, Southampton SO14 0YN, UK
3 Department of Computer Science, Nazeer Hussain University, ST-2, Near Karimabad, Karachi 75950, Pakistan
* Author to whom correspondence should be addressed.
Information 2025, 16(6), 441; https://doi.org/10.3390/info16060441
Submission received: 11 April 2025 / Revised: 19 May 2025 / Accepted: 20 May 2025 / Published: 26 May 2025
(This article belongs to the Special Issue Real-World Applications of Machine Learning Techniques)

Abstract

In today’s data-driven business landscape, effective customer segmentation is crucial for enhancing engagement, loyalty, and profitability. Traditional clustering methods often struggle with datasets containing both numerical and categorical variables, leading to suboptimal segmentation. This study addresses this limitation by introducing a novel application of Factor Analysis of Mixed Data (FAMD) for dimensionality reduction, integrated with K-means and Agglomerative Clustering for robust customer segmentation. While FAMD is not new in data analytics, its potential in customer segmentation has been underexplored. This research bridges that gap by demonstrating how FAMD can harmonize mixed data types, preserving structural relationships that conventional methods overlook. The proposed methodology was tested on a Kaggle-sourced retail dataset comprising 3900 customers, with preprocessing steps including correlation ratio filtering (η ≥ 0.03), standardization, and encoding. FAMD reduced the feature space to three principal components, capturing 81.46% of the variance, which facilitated clearer segmentation. Comparative clustering analysis showed that Agglomerative Clustering (Silhouette Score: 0.52) outperformed K-means (0.51) at k = 4, revealing distinct customer segments such as seasonal shoppers and high spenders. Practical implications include the development of targeted marketing strategies, validated through heatmap visualizations and cluster profiling. This study not only underscores the suitability of FAMD for customer segmentation but also sets the stage for more nuanced marketing analytics driven by mixed-data methodologies.
Keywords: customer segmentation; FAMD; K-means; agglomerative clustering; silhouette score; mixed data analysis
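
As an illustration of the two preprocessing ideas named in the abstract, the following is a minimal sketch and not the authors' implementation. It computes the correlation ratio η between a categorical and a numerical variable, and then builds FAMD-style coordinates using the textbook construction: z-scored numerical columns, indicator columns scaled by the inverse square root of each category's proportion, centering, and a singular value decomposition. The function names, the toy customer records, and the choice of two components are assumptions made for this example; the sketch also assumes that no numerical column is constant.

```python
import numpy as np


def correlation_ratio(categories, values):
    """Correlation ratio (eta) between a categorical and a numerical variable."""
    categories = np.asarray(categories)
    values = np.asarray(values, dtype=float)
    grand_mean = values.mean()
    ss_total = ((values - grand_mean) ** 2).sum()
    if ss_total == 0:
        return 0.0  # a constant variable carries no variance to explain
    ss_between = 0.0
    for level in np.unique(categories):
        group = values[categories == level]
        ss_between += group.size * (group.mean() - grand_mean) ** 2
    return float(np.sqrt(ss_between / ss_total))


def famd_coordinates(numeric, categorical, n_components=3):
    """FAMD-style row coordinates and explained-variance ratios.

    numeric: (n, p) array of numerical features (assumed non-constant).
    categorical: list of length-n label sequences, one per categorical feature.
    """
    numeric = np.asarray(numeric, dtype=float)
    # Numerical block: z-score each column.
    blocks = [(numeric - numeric.mean(axis=0)) / numeric.std(axis=0)]
    # Categorical blocks: indicator columns divided by sqrt(category proportion).
    for column in categorical:
        column = np.asarray(column)
        levels = np.unique(column)
        indicator = (column[:, None] == levels[None, :]).astype(float)
        blocks.append(indicator / np.sqrt(indicator.mean(axis=0)))
    combined = np.hstack(blocks)
    combined = combined - combined.mean(axis=0)
    # PCA on the combined matrix via SVD.
    u, s, _ = np.linalg.svd(combined, full_matrices=False)
    explained = s ** 2 / (s ** 2).sum()
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Hypothetical toy records: (age, purchase amount) plus a season label.
toy_numeric = np.array(
    [[23, 40.0], [35, 55.0], [47, 90.0], [52, 20.0], [31, 75.0], [60, 65.0]]
)
toy_season = ["Winter", "Summer", "Winter", "Spring", "Summer", "Spring"]

eta = correlation_ratio(toy_season, toy_numeric[:, 1])
coords, explained = famd_coordinates(toy_numeric, [toy_season], n_components=2)
print("eta(season, purchase amount):", round(eta, 3))
print("explained variance ratios:", np.round(explained, 3))
```

The resulting `coords` array could then be passed to a clustering routine such as scikit-learn's `KMeans` or `AgglomerativeClustering` and scored with `silhouette_score`, mirroring the comparison described in the abstract.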
