Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis
Abstract: Topological data analysis is a novel approach to extracting meaningful information from high-dimensional data, and it is robust to noise. It is based on topology, which studies the geometric shape of data. To apply topological data analysis, an algorithm called mapper is adopted. The output of mapper is a simplicial complex that represents a set of connected clusters of data points. In this paper, we explore the feasibility of topological data analysis for mining social network data by addressing the problem of image popularity. We randomly crawl images from Instagram and use mapper to analyze the effects of social context and image content on an image's popularity. Mapper clusters the images using each feature, and the ratio of popular images in each cluster is computed to identify the clusters with a high or low likelihood of popularity. The popularity of images is then predicted to evaluate the accuracy of topological data analysis. This approach is further compared, in terms of accuracy, with traditional clustering algorithms, including k-means and hierarchical clustering, and the results show that topological data analysis outperforms both. Moreover, topological data analysis provides meaningful information based on the connectivity between the clusters.
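The mapper pipeline summarized in the abstract can be illustrated with a minimal one-feature sketch. This is not the authors' implementation: the function names, the interval cover, the gap-based single-linkage clustering, and the toy data are all assumptions made for illustration. With a single feature, the filter value and the clustering space coincide, which keeps the example short.

```python
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbors share `overlap` of their length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    cover = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    # Pin the last upper bound to hi so rounding cannot exclude the maximum value.
    cover[-1] = (cover[-1][0], hi)
    return cover


def gap_clusters(indices, values, max_gap):
    """1-D single-linkage clustering: split the sorted points where a gap exceeds max_gap."""
    ordered = sorted(indices, key=lambda i: values[i])
    clusters, current = [], []
    for i in ordered:
        if current and values[i] - values[current[-1]] > max_gap:
            clusters.append(current)
            current = []
        current.append(i)
    if current:
        clusters.append(current)
    return clusters


def mapper_1d(values, n_intervals=3, overlap=0.5, max_gap=1.0):
    """Return (nodes, edges): nodes are sets of point indices, edges link nodes sharing a point."""
    lo, hi = min(values), max(values)
    nodes = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(set(c) for c in gap_clusters(members, values, max_gap))
    edges = [(s, t) for s, t in combinations(range(len(nodes)), 2) if nodes[s] & nodes[t]]
    return nodes, edges


def popularity_ratio(node, popular):
    """Fraction of points in a node that are labeled popular (1) rather than not (0)."""
    return sum(popular[i] for i in node) / len(node)


# Hypothetical toy data: one feature value per image and a 0/1 popularity label.
feature = [0.0, 0.5, 1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
popular = [0, 0, 0, 1, 0, 1, 1, 1]

nodes, edges = mapper_1d(feature, n_intervals=3, overlap=0.5, max_gap=1.0)
for k, node in enumerate(nodes):
    print(k, sorted(node), popularity_ratio(node, popular))
print(edges)
```

Nodes with a high ratio mark regions of the feature where images tend to be popular, and the edges show which clusters are connected through shared images. That connectivity is the additional information that k-means and hierarchical clustering do not expose.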
Almgren, K.; Kim, M.; Lee, J. Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis. Entropy 2017, 19, 360.
Almgren K, Kim M, Lee J. Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis. Entropy. 2017; 19(7):360.
Almgren, Khaled; Kim, Minkyu; Lee, Jeongkyu. 2017. "Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis." Entropy 19, no. 7: 360.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.