Article

Measuring Interactions in Categorical Datasets Using Multivariate Symmetrical Uncertainty

by Santiago Gómez-Guerrero 1,*, Inocencio Ortiz 1, Gustavo Sosa-Cabrera 1, Miguel García-Torres 2 and Christian E. Schaerer 1
1 Polytechnic School, National University of Asuncion, San Lorenzo 2111, Paraguay
2 Data Science and Big Data Lab, Universidad Pablo de Olavide, ES-41013 Seville, Spain
* Author to whom correspondence should be addressed.
Entropy 2022, 24(1), 64; https://doi.org/10.3390/e24010064
Submission received: 12 October 2021 / Revised: 29 November 2021 / Accepted: 1 December 2021 / Published: 30 December 2021
(This article belongs to the Special Issue Entropy: The Scientific Tool of the 21st Century)

Abstract

Interaction between variables is often found in statistical models, and it is usually expressed in the model as an additional term when the variables are numeric. However, when the variables are categorical (also known as nominal or qualitative) or mixed numerical-categorical, defining, detecting, and measuring interactions is not a simple task. In this work, building on an entropy-based correlation measure for n nominal variables, the Multivariate Symmetrical Uncertainty (MSU), we propose a formal and broader definition of interaction among variables. Two series of experiments are presented. In the first series, we observe that datasets in which some record types or combinations of categories are absent, forming patterns of records, often display interactions among their attributes. In the second series, the interaction/non-interaction behavior of a regression model (built entirely on continuous variables) is successfully replicated on a discretized version of the dataset, showing an interaction-wise correspondence between the continuous and the discretized versions. Hence, we demonstrate that the proposed definition of interaction, enabled by the MSU, is a valuable tool for detecting and measuring interactions within linear and non-linear models.
Keywords: interaction; intrinsic interaction; categorical data; patterned data; multivariable correlation; gain in multiple correlation; multivariate symmetrical uncertainty
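
To make the idea of an entropy-based correlation among n nominal variables concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the MSU formula, so the code assumes the commonly cited form MSU = n/(n-1) * (1 - H(X1,...,Xn) / (H(X1) + ... + H(Xn))), with entropies estimated from empirical frequencies. The function names `entropy` and `msu` and the XOR toy dataset are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy in bits of the empirical distribution of hashable items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def msu(columns):
    """Assumed MSU form: n/(n-1) * (1 - H(joint) / sum of marginal entropies).

    `columns` is a list of equal-length sequences, one per categorical variable.
    """
    n = len(columns)
    if n < 2:
        raise ValueError("MSU needs at least two variables")
    marginal_sum = sum(entropy(col) for col in columns)
    if marginal_sum == 0:
        # All variables are constant; returning 0.0 is a convention of this sketch.
        return 0.0
    joint = entropy(list(zip(*columns)))  # each record becomes one joint symbol
    return n / (n - 1) * (1 - joint / marginal_sum)


# Toy patterned dataset (assumption): z is the XOR of x and y, so only 4 of the
# 8 possible (x, y, z) record types are present.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
z = [a ^ b for a, b in zip(x, y)]

msu_xy = msu([x, y])      # pairwise: no association
msu_xz = msu([x, z])      # pairwise: no association
msu_xyz = msu([x, y, z])  # the three variables taken together

print(msu_xy, msu_xz, msu_xyz)
```

In this toy case every pair of variables has an MSU of zero while the three variables together score about 0.5, which is the kind of jointly visible association that the abstract refers to as interaction in datasets with absent record types. The paper's formal definition of interaction is not reproduced here.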
