Open Access Article

Entropic Statistical Description of Big Data Quality in Hotel Customer Relationship Management

1 Department of Business and Management, Rey Juan Carlos University, 28943 Madrid, Spain
2 Department of Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, 28943 Madrid, Spain
* Author to whom correspondence should be addressed.
Entropy 2019, 21(4), 419; https://doi.org/10.3390/e21040419
Received: 23 February 2019 / Revised: 7 April 2019 / Accepted: 17 April 2019 / Published: 19 April 2019
(This article belongs to the Section Signal and Data Analysis)

Abstract

Customer Relationship Management (CRM) is now a fundamental tool in the hospitality industry, and it can be regarded as a big-data scenario because of the large number of records that managers handle every year. Data quality is crucial for the success of these systems. One of the main issues that businesses in general, and hospitality businesses in particular, must solve in this setting is the identification of duplicated customers. This problem has received little attention in the recent literature, probably in part because it is not easy to state in statistical terms. In the present work, we formulate duplicated-customer identification as a large-scale data analysis problem, and we propose and benchmark a general-purpose solution for it. Our system consists of four basic elements: (a) a generic feature representation for the customer fields in a simple table-shaped database; (b) an efficient distance for comparing feature values, namely the Levenshtein distance computed with the Wagner-Fischer algorithm; (c) a big-data implementation using basic map-reduce techniques to readily support the comparison of strategies; and (d) an X-from-M criterion to identify the possible neighbors of a duplicated-customer candidate. We analyze the mass density function of the distances in the CRM text-based fields and characterize their behavior and consistency in terms of the entropy and the mutual information of these fields. Our experiments on a large CRM from a multinational hospitality chain show that the distance distributions are statistically consistent for each feature, and that the neighborhood thresholds are adjusted automatically by the system in a first step and can subsequently be fine-tuned according to the manager's experience.
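Elements (b) and (d) can be illustrated with a minimal sketch. The Levenshtein function below follows the standard Wagner-Fischer dynamic program. The X-from-M function rests on an assumption, since the abstract does not define the criterion: two records are treated as possible duplicates when at least X of their M compared fields lie within a per-field distance threshold. The names levenshtein and x_from_m, the thresholds, and the sample records are hypothetical and are not taken from the article.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via the Wagner-Fischer dynamic program.

    Only two rows of the edit matrix are kept, so memory is O(min(len)).
    """
    if len(a) < len(b):
        a, b = b, a  # the shorter string drives the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def x_from_m(record_a: Sequence[str], record_b: Sequence[str],
             thresholds: Sequence[int], x: int) -> bool:
    """Assumed reading of X-from-M: at least x of the M fields are close."""
    if not (len(record_a) == len(record_b) == len(thresholds)):
        raise ValueError("records and thresholds must all have M entries")
    close_fields = sum(
        levenshtein(field_a, field_b) <= limit
        for field_a, field_b, limit in zip(record_a, record_b, thresholds)
    )
    return close_fields >= x


# Hypothetical customer records with M = 3 text fields (name, surname, email).
first = ("Maria", "Gonzalez", "maria.g@example.com")
second = ("Maria", "Gonzales", "m.gonzalez@example.com")
is_candidate = x_from_m(first, second, thresholds=(1, 1, 2), x=2)
```

In this toy case the name and surname fields fall within their thresholds while the email field does not, so the pair is flagged with X = 2 but not with X = 3. In a map-reduce setting, a pairwise comparison of this kind would be the unit of work distributed across workers.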
The entropy distributions for the different variables, as well as the mutual information between pairs, show multimodal profiles, often with a wide gap between close and far fields. This motivates the so-called X-from-M strategy, which is shown to be computationally affordable and gives the expert a reduced number of duplicate candidates to supervise, with low X values being enough to ensure the sensitivity required at the automatic detection stage. The proposed system further supports the benefits of big-data technologies in CRM scenarios for hotel chains and, rather than relying on ad hoc heuristic rules, promotes the research and development of theoretically principled approaches.
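As a rough illustration of the entropy and mutual information mentioned above, the sketch below uses plug-in (empirical frequency) estimates over lists of integer distances. The abstract does not say how the article estimates these quantities, so the estimator, the choice of bits as the unit, the function names, and the sample distances are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def mass_function(values: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Empirical probability of each distinct value in the sample."""
    total = len(values)
    return {value: count / total for value, count in Counter(values).items()}


def entropy(values: Sequence[Hashable]) -> float:
    """Plug-in Shannon entropy of the sample, in bits."""
    return -sum(p * math.log2(p) for p in mass_function(values).values())


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plug-in mutual information between two paired samples, in bits."""
    if len(xs) != len(ys):
        raise ValueError("paired samples must have the same length")
    joint = mass_function(list(zip(xs, ys)))
    marginal_x = mass_function(xs)
    marginal_y = mass_function(ys)
    return sum(
        p * math.log2(p / (marginal_x[x] * marginal_y[y]))
        for (x, y), p in joint.items()
    )


# Hypothetical edit distances from one candidate record to eight other
# records, measured on two different text fields.
name_distances = [0, 1, 1, 6, 7, 7, 8, 8]
surname_distances = [0, 0, 1, 7, 7, 8, 8, 9]

name_entropy = entropy(name_distances)
shared_information = mutual_information(name_distances, surname_distances)
```

A sample concentrated on a single distance has zero entropy, and two fields whose distances vary independently have mutual information close to zero, so both quantities give a compact summary of how distance profiles behave within and across fields.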
Keywords: Customer Relationship Management; hospitality industry; big data; duplicate detection; name matching; Levenshtein distance; X-from-M strategy; entropy; mutual information; mass density function

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

González-Serrano, L.; Talón-Ballestero, P.; Muñoz-Romero, S.; Soguero-Ruiz, C.; Rojo-Álvarez, J.L. Entropic Statistical Description of Big Data Quality in Hotel Customer Relationship Management. Entropy 2019, 21, 419.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Entropy EISSN 1099-4300 Published by MDPI AG, Basel, Switzerland